Abstract


In  this  dissertation,  the  asynchronous  direct-sequence  code  division  multiple  access 
(CDMA)  communication  system  is  described  and  a  number  of  multiuser  detection 
approaches are proposed that improve upon the performance of the conventional
basestation. Both coded and uncoded systems are studied for nondispersive, additive white
Gaussian  noise  (AWGN)  channels. 

For  the  uncoded  system  case,  the  multiuser  detection  techniques  that  have  already 
been proposed are first reviewed. Then, two decision feedback equalizers (DFEs) that
have been proposed are combined to form a new hybrid DFE which outperforms the others
in situations where the multiuser interference in the system is high.

Next,  the  case  where  each  user  in  the  system  employs  a  convolutional  code  to 
improve  its  performance  is  studied.  First,  the  optimal  multiuser  sequence  estimator  is 
formulated, and it is shown that this receiver may be implemented using a Viterbi algorithm
which operates on a time-varying trellis with a number of states which is exponential
in the product of the number of users in the system and the constraint length of the
codes used (for the rate-1/2 code case). Because this optimal receiver has a very high
complexity, a variety of suboptimum receivers are proposed which have a performance
level near that of the optimal receiver but have a more manageable complexity. All of
the  approaches  are  compared  on  the  basis  of  their  performance  (through  analysis  and 
simulation),  their  complexity  and  their  decoding  delay. 

Chapter 1 Introduction

Multiple  access  communication  systems  are  systems  in  which  several  (or  many) 
users  share  a  common  communication  channel  of  some  kind.  Generally,  in  these  types  of 
systems there are many more potential users than there are channel resources to accommodate
them all at the same time. As a result, it is not possible to dedicate a fraction of
the channel resource to every potential user. Fortunately, the users of this kind of system
usually need to transmit bursty information messages. As a result, channel resources
may be allocated to only those users which are active.

There  are  a  number  of  methods  that  have  been  proposed  over  the  years  to  allow  the 
active  users  to  share  the  channel  resources,  or  available  bandwidth.  Some  of  these 
methods require that the users tightly coordinate their transmissions with each other in
some fashion, while other methods require much less coordination. All may be interpreted
as ways of having the active users coexist in the frequency and time space of the
channel with an acceptable level of mutual interference. [68]

The least coordinated method of achieving multiple access communications is to
have any user that needs to transmit do so using the entire channel while monitoring
whether there was a message collision with any other user. This technique is often
referred to as carrier sense multiple access with collision detection (CSMA/CD). If the
transmitting user senses a collision, it will adhere to the rules of a well defined protocol
to resolve the collision. This multiple access method requires no central controller to
dictate when the users must transmit. With this method, if there are a significant number
of active users, then there are many collisions and a great deal of the channel resources is
wasted in resolving the collisions. This is a price paid for the lack of centralized control.
CSMA/CD is most appropriate for systems which cannot tolerate any interference and
in which throughput can be sacrificed for performance. CSMA/CD has traditionally found
widespread application in computer networks which are bursty, are not overly congested,
and require that there are very few bit errors. For more congested systems which can
tolerate higher error rates, such as a cellular telephone system in a city, CSMA/CD is not
appropriate. Figure 1.1 illustrates a cellular communication system with basestations at
the center of each cell and a number of mobile active users in the cell.

A more coordinated method of achieving multiple access is to have a central controller
dictate which fraction of the time a particular active user may use. In this method,
called time division multiple access (TDMA), each active user transmits its message during
its assigned time slot in a round-robin fashion; thus the active users transmit at different
times on the same frequency. When an active user finishes its message, it notifies
the central controller and its time slot is reallocated to another active user.

A  similar  method  is  called  frequency  division  multiple  access  (FDMA).  In  this 
method,  the  central  controller  dynamically  allocates  frequency  slots  to  each  of  the  active 
users. Thus, in FDMA, the active users transmit at the same time on different frequencies.
When an active user completes its transmission, it notifies the central controller and
its frequency slot is given away to another active user. One advantage of FDMA over
TDMA is that the users do not need to be synchronized with each other in time. Both
TDMA and FDMA can achieve a higher capacity than CSMA/CD; however, the prices
paid for this capacity increase are some interference between the active users which share the
channel, the need for a central controller and the added delay associated with the process
of requesting the channel resources from the central controller.

A  fourth  method  for  achieving  multiple  access  is  called  code  division  multiple 
access  (CDMA).  In  CDMA  systems,  the  active  users  transmit  at  the  same  time  on  the 
same  frequency  and  the  way  in  which  the  users  can  be  distinguished  or  addressed  is 
through the use of a code which is impressed on each user’s signal. In some CDMA systems,
a central controller will allocate the codes to each user. In other CDMA systems,
each user will be assigned a permanent code sequence and there will be no need for a
central controller. This may be appropriate in a multipoint-to-multipoint system with a
modest number of users. In this dissertation, we will primarily concern ourselves with
cellular-type systems where the basestation of each cell acts as a central controller for its
cell.

Figure 1.1 Illustration of a cellular communication system. Each basestation communicates with the users in its cell.

Because  the  active  users  transmit  at  the  same  time  on  the  same  frequency  in 
CDMA, they are, in a sense, continuously colliding with each other. The key difference
between the collisions of CDMA and CSMA/CD is that the codes which are impressed
on the CDMA signals minimize the effect of the collisions, while in CSMA/CD the collisions
are not tolerated at all. The higher the quality of the codes, the lower the
mutual interference between the active users will be in a CDMA system.

A  heated  debate  has  erupted  in  the  cellular  communications  industry  over  the  past 
few  years  over  the  relative  capacities  of  CDMA,  TDMA  and  FDMA.  Proponents  of  each 
method  tend  to  distort  the  capacity  calculations  in  favor  of  their  favorite  method.  In  this 
dissertation,  we  will  not  consider  a  comparison  of  the  relative  capacities,  but  will  instead 
focus  on  CDMA  and  study  methods  of  detection  which  ultimately  lead  to  a  large  capacity 
increase  for  CDMA  over  the  traditional  methods  of  CDMA  detection. 

Two undisputed advantages of CDMA over TDMA and FDMA are its soft performance
degradation with the number of users, and its lack of a need for any kind of time
or  frequency  coordination  between  the  active  users.  As  the  number  of  active  users 
increases  in  a  CDMA  network,  the  interference  for  each  of  the  active  users  increases. 
This  results  in  a  slow  degradation  of  the  performance  of  every  user  as  the  congestion  in 
the  network  increases.  In  contrast,  once  all  of  the  time  or  frequency  slots  are  accounted 
for  in  TDMA  or  FDMA,  the  network  is  full.  If  slots  are  empty  in  TDMA  or  FDMA,  then 
some  of  the  channel  resources  are  going  to  waste.  Additionally,  CDMA  does  not  require 
that  the  active  users  coordinate  their  transmissions  in  time  or  frequency  as  in  TDMA  or 
FDMA.  These  uncoordinated  CDMA  networks  are  called  asynchronous  networks.  Some 
other important advantages of CDMA over FDMA and TDMA are its robustness to narrowband
fading and jammers, its ability to operate in the background noise of frequency
bands that are occupied by narrowband users and its inherent privacy.

The  most  common  form  of  CDMA  in  commercial  applications  is  direct  sequence 
CDMA.  In  this  form  of  CDMA,  each  active  user  is  assigned  a  different  code  sequence, 
or  signature  sequence,  and  this  high-rate  sequence  modulates  the  data  for  that  user  before 
it  is  transmitted.  Because  the  signature  sequence  is  a  higher  rate  sequence  than  the  data 
sequence,  the  effect  of  this  modulation  of  the  two  signals  is  to  spread  the  spectrum  of  the 
transmitted  waveform  to  a  bandwidth  related  to  the  signature  sequence  rate.  Thus,  the 
direct  sequence  modulation  method  is  a  form  of  spread  spectrum  communications. 
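
The spreading operation described above can be sketched in a few lines. This is only an illustrative baseband sketch: the 7-chip signature and the helper names `spread` and `despread` are hypothetical, the carrier is ignored, and a single user with no noise is assumed.

```python
def spread(bits, signature):
    """Multiply each data bit (+1/-1) by the full chip sequence (+1/-1).

    The output runs at len(signature) chips per data bit, which is what
    widens the bandwidth of the transmitted waveform.
    """
    return [b * c for b in bits for c in signature]


def despread(chips, signature):
    """Correlate each block of chips with the signature and take the sign."""
    n = len(signature)
    decisions = []
    for start in range(0, len(chips), n):
        block = chips[start:start + n]
        corr = sum(x * c for x, c in zip(block, signature))
        decisions.append(1 if corr >= 0 else -1)
    return decisions


# Hypothetical 7-chip signature sequence and a short data sequence.
signature = [1, -1, 1, 1, -1, -1, 1]
bits = [1, -1, -1, 1]
chips = spread(bits, signature)
recovered = despread(chips, signature)
```

With a single user and no noise, the correlator recovers the data exactly; the interesting behavior arises only when several users' chip streams are summed.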

The receiver operating in this environment receives a signal which is the sum of all
of the active users’ transmitted signals plus noise, and the receiver’s job is to reliably
decode the signal of interest from this received composite signal. The users are not synchronized
in general, and in addition, the received signal strengths of the users are typically
unequal. In an attempt to improve the performance of each link, error control coding
may be used on each of the links as well. The receiver structures that will be studied
in this dissertation are most appropriate for a basestation in a cellular telephone cell or
personal communication network (PCN) cell. It is also possible, however, that the
receiver architectures that will be discussed could be one of the users’ receivers in a
decentralized multiple access network.

The  traditional  method  of  coherently  demodulating  direct  sequence  CDMA  signals 
is  to  synchronize  a  local  code  generator  and  oscillator  to  the  signal  of  interest  and  then  to 
make decisions on the received signal as though the desired signal is the only one
present. The received signal usually consists of the desired signal, a multiuser interference
(MUI) signal, and thermal/shot noise, and may be further degraded by channel time-dispersion.
The traditional decoder’s structure is that of a correlator or matched filter
which is matched to the desired signal followed by a decoder if coding is used on the
link. Figure 1.2 illustrates this conventional receiver or basestation. (The notation used
in this figure will be defined in Chapter 2.)

Figure 1.2 K-user asynchronous network with a conventional basestation that
decodes each user independently.

The performance of the traditional decoder suffers for two major reasons. First, the
signature sequences of the different users will usually not be orthogonal to each other,
giving rise to the MUI. Second, in the common situation where all of the signals
arriving at the receiver are of different strengths, the strong signals tend to overwhelm
the weak signals, even with reasonably good signature sequences. This second problem
is referred to as the near-far problem.
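
The near-far effect can be illustrated with a deliberately simplified sketch. It assumes two synchronous users, no noise, and a single hypothetical cross-correlation value `rho`; the asynchronous case treated in this dissertation involves more terms, but the mechanism is the same. The function names are illustrative only.

```python
import math


def conventional_decision(d1, d2, e1, e2, rho):
    """Sign decision on user 1's matched-filter output (noise-free).

    The output is user 1's own term plus the interference from user 2,
    which is scaled by the cross-correlation rho and by user 2's amplitude.
    """
    r1 = math.sqrt(e1) * d1 + rho * math.sqrt(e2) * d2
    return 1 if r1 >= 0 else -1


def user1_errors(e1, e2, rho):
    """Count the data combinations (d1, d2) for which user 1 is decoded wrongly."""
    errors = 0
    for d1 in (1, -1):
        for d2 in (1, -1):
            if conventional_decision(d1, d2, e1, e2, rho) != d1:
                errors += 1
    return errors


equal_power = user1_errors(e1=1.0, e2=1.0, rho=0.3)    # interferer at equal power
strong_user = user1_errors(e1=1.0, e2=25.0, rho=0.3)   # interferer about 14 dB stronger
```

In this sketch the equal-power case produces no errors, while the strong interferer flips user 1's decision whenever the two data bits differ, even though no noise is present.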

There are two traditional methods for improving the performance of the conventional
receiver. The first is to find an improved set of signature sequences which have as
high  a  degree  of  orthogonality  as  possible.  The  effectiveness  of  this  approach  is  limited 
by  the  Welch  inner  product  bound,  which  defines  the  lowest  achievable  maximum 
crosscorrelation  between  asynchronous  signature  sequences  of  a  given  length.  The  set  of 
binary  Kasami  sequences  achieves  the  Welch  bound,  although  this  set  of  sequences  is 
unfortunately  rather  small.  The  set  of  binary  Gold  sequences  is  a  much  larger  set,  which 
comes  close  to  the  Welch  bound.  Thus  there  is  not  much  to  be  gained  by  attacking  the 
problem  in  this  way.  [64] 

The second traditional method for improving the conventional receiver’s performance
is to implement a power control scheme, wherein each user’s transmitted power is
adjusted so that its received signal power at the basestation is the same as that of all of
the other users’ signals. It will be seen later in this dissertation that this approach is a
solution to the near-far problem, but it is a conservative and somewhat inefficient solution.

A  major  improvement  over  the  traditional  receiver  can  be  achieved  by  viewing  the 
MUI  not  as  a  random  noise  signal,  but  instead  as  a  structured  interferer.  Because  all  of 
the  signals  making  up  the  MUI  in  a  CDMA  network  are  generally  of  the  same  structure 
as  the  signal  of  interest,  and  because  their  signature  sequences  are  generally  known  to  the 
receiving  system,  it  is  possible  to  augment  the  standard  receiver  structure  and  exploit  this 
knowledge of the MUI. This can be done by estimating the MUI and attempting to cancel it,
or by jointly estimating the entire message. Figure 1.3 illustrates this kind of receiver,
which is generally referred to as a multiuser receiver. The augmentation required consists
of some additional synchronization circuitry to lock onto some or all of the interfering
signals, and then a decoding algorithm which would use these additional statistics to
estimate the MUI and cancel it.

The rationale for using this augmented receiver is that if it is successful in estimating
the MUI, it will, in many situations, be able to eliminate the near-far problem and the
error-rate floor, and its performance will be approximately that of a single-user
link. The drawback of this approach is the complexity associated with the additional
synchronization circuits and the algorithm for estimating and eliminating the MUI. It is
important to note that in multipoint-to-point networks this additional synchronization circuitry
must be a part of a conventional basestation anyway, as a basestation must lock to
and demodulate the signals of all users in the cell served by that basestation. Thus, in
certain applications, the additional complexity of jointly decoding the signals in the system
will not be as great as in others, such as a single user’s receiver in a multipoint-to-multipoint
network.

It is important to note that this technique is not appropriate in a jamming environment
where the interfering signal structure is not known. It is also worth keeping in mind
that if the MUI becomes too severe, the main limitation of the CDMA system may be the
synchronization of the basestation to each user’s signal. If the MUI is strong
enough  to  prevent  the  basestation  from  acquiring  the  component  signals,  then  no  form  of 
coherent  detection  will  be  possible,  conventional  or  otherwise.  Finally,  if  the  MUI  is  so 
weak  that  the  users  do  not  degrade  each  other’s  performance,  then  the  performance  of  the 
conventional basestation will be essentially optimum. Thus the multiuser detection techniques
described in this dissertation are aimed at the cases where at least some of the
users  in  the  system  suffer  in  performance  due  to  MUI,  but  the  MUI  is  not  so  strong  as  to 
prevent  acquisition  of  the  signals  at  all. 


Figure 1.3 K-user asynchronous network with a multiuser receiver which
jointly decodes all of the users in the system. (Diagram blocks: Users in Cell; Multiuser Basestation.)

There has been a large amount of interest recently in the design of multiuser
receivers for CDMA systems. Most of this work has centered on uncoded links [1]-[44].
References [3] and [50] are particularly good tutorial papers on this subject. Only
recently has the problem of multiuser detection of coded links been considered
[41], [43], [44], [53].

In  this  dissertation,  multiuser  receivers  will  be  examined  for  both  coded  and 
uncoded  CDMA  systems.  We  will  begin  by  studying  the  notion  of  multiuser  detection  in 
Chapter  2  by  examining  some  of  the  important  multiuser  receivers  that  have  already  been 
proposed.  This  theme  will  continue  into  Chapter  3  where  we  will  take  a  detailed  look  at 
the  decision  feedback  multiuser  detection  techniques  for  uncoded  links  which  have 
already  been  proposed.  This  discussion  will  lead  to  a  new  hybrid  decision  feedback 
equalizer  which  provides  superior  performance  to  those  that  have  already  been  proposed. 

Error control coding is a traditional tool for improving the reliability of communication
systems. As a result, Chapter 4 will begin our look at CDMA links where each user
employs a convolutional code to improve performance. In this chapter, the optimal
sequence estimator will be formulated and its performance will be analyzed both
analytically and through computer simulations. We will see that the optimal
sequence estimator provides a benchmark for all other multiuser receivers, as it is the
best we can achieve in terms of sequence error probability. The problem with this
optimal receiver is that it has a prohibitively high complexity.

As a result of the optimal receiver’s high complexity, in Chapter 5 we will examine
a large number of suboptimum multiuser receiver architectures. The goal in studying the
suboptimum approaches is to find a receiver that maintains most of the optimal receiver’s
high performance, while doing so with a much lower complexity. These receivers will
be studied analytically, whenever possible, and using computer simulations when an
analysis is not possible. A performance measure will be introduced called the asymptotic
multiuser coding gain (AMCG), which will be used extensively to compare the various
receivers’ performances. We will see from this performance analysis that many of the
suboptimum approaches do achieve nearly optimum performance with a low complexity.

In  order  to  discuss  the  multiuser  receivers  in  Chapters  3,  4  and  5,  however,  it  is 
necessary to lay out the notation and to define the various approaches that have been proposed
by other researchers in the past. This notation and background will be the subject
of  the  next  chapter. 

Chapter 2 General Multiuser Detection

In this chapter, we will begin by outlining the notation which will be used
throughout the dissertation. This will lead into a precise formulation of the problem
which multiuser detection seeks to solve. A summary of the various multiuser receivers
that have been proposed for uncoded links will then be given in order to provide the
necessary background for the following chapters. Finally, a brief introduction to the
topic of multiuser detection for coded links will be given to motivate the work in
Chapters 4 and 5.

It  will  be  assumed  that  the  CDMA  system  has  K  users  operating  simultaneously  on 
a  common  frequency  in  an  asynchronous  fashion.  Furthermore,  each  user  may  employ 
binary  convolutional  coding  on  its  link.  While  it  is  quite  conceivable  that  block  codes 
could  be  used  effectively  on  a  CDMA  link,  convolutional  codes  have  the  advantage  that 
they  operate  in  a  sequential  fashion.  Because  the  decoders  that  will  be  studied  in  this 
work  are  sequential  in  nature,  the  convolutional  codes  are  a  much  better  match  to  the 
decoders  than  block  codes.  Also,  in  [38]  it  was  shown  that  in  CDMA  systems,  binary 
convolutional  codes  often  outperform  more  general  trellis  codes  which  map  information 
symbols  onto  M-level  signals  where  M  is  larger  than  the  alphabet  size  of  the  information 
symbols.  In  other  words,  there  is  no  particular  advantage  to  using  nonbinary  coding. 
This  may  be  considered  a  further  justification  for  the  confinement  in  scope  of  this  work  to 
binary convolutional codes. One further assumption in this work is that each user
employs  the  same  convolutional  code,  although  it  is  not  at  all  difficult  to  generalize  this 
work  to  the  case  where  each  user  employs  a  different  code. 

At each time interval, $n$, of length $T_s$, the convolutional code is generated for user $k$
by passing $P$ binary information bits, $I_k(n) = (I_k^{(1)}(n), \ldots, I_k^{(P)}(n))$, through a shift register
consisting of $W$ stages with $Q$ modulo-2 adders, as shown in Figure 2.1. The number of
output bits for each $P$-bit input sequence is $Q$ bits. The rate of the code is $R_c = P/Q$ and
the constraint length of the code is $W$. The output sequence of binary code bits for the
interval corresponding to input bits $I_k(n)$ is $(D_k^{(1)}(n), \ldots, D_k^{(Q)}(n))$. Note that for $W = 1$
and $P = Q = 1$, we have the uncoded case, so in that case $D_k(n) = I_k(n)$.

Figure 2.1 General convolutional encoder structure. [64]
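
As a concrete instance of the encoder structure described above, the following sketch implements the special case $P = 1$, $Q = 2$, $W = 3$. The tap patterns (octal 7 and 5) are an assumption chosen because they form a widely used rate-1/2 code; they are not taken from this dissertation, and the function name `conv_encode` is illustrative.

```python
def conv_encode(info_bits, generators=((1, 1, 1), (1, 0, 1))):
    """Rate-1/Q convolutional encoder with a W-stage shift register.

    Each tuple in `generators` lists the taps of one modulo-2 adder, so
    Q = len(generators) code bits are produced for every information bit.
    The default taps (octal 7, 5) are an illustrative assumption.
    """
    w = len(generators[0])              # number of shift-register stages, W
    register = [0] * w                  # register starts in the all-zero state
    code_bits = []
    for bit in info_bits:
        register = [bit] + register[:-1]        # shift the new bit in
        for taps in generators:
            # One modulo-2 adder: XOR of the tapped register stages.
            code_bits.append(sum(t & r for t, r in zip(taps, register)) % 2)
    return code_bits


encoded = conv_encode([1, 0, 1, 1])
```

Each information bit yields two code bits, so four information bits produce eight code bits, in agreement with $R_c = P/Q = 1/2$.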

In the time interval $[nT_s + (q-1)T + \tau_k,\ nT_s + qT + \tau_k)$, user $k$ transmits data bit $D_k^{(q)}(n)$,
where $\tau_k$ represents the time shift of the user relative to some reference time, thus
accounting for the asynchronism of the users relative to each other. $T$ represents the code
bit period and $T_b = T/R_c$ is the information bit duration, thus $T_s = QT = PT_b$. Let
$\tau_k = m_k T + \tilde{\tau}_k$, with $\tilde{\tau}_k \in [0, T)$ and $m_k \in \{0, \ldots, Q-1\}$. Thus $m_k T$ is a coarse time shift and $\tilde{\tau}_k$
is a fine time shift for user $k$.


Each user in the system is assigned a particular signature sequence, and it will be
assumed that this signature sequence has a duration equal to the code bit interval,
although this assumption can be relaxed with a change of the notation. We will combine
the carrier and signature sequence into a single signal. Thus the $k$th carrier multiplied by
the binary ($\pm 1$) signature sequence $PN_k(t)$ will be denoted by

$$s_k(t) = \begin{cases} \sqrt{2/T}\, PN_k(t) \cos(\omega_c t), & 0 \le t \le T \\ 0, & \text{otherwise} \end{cases} \tag{2.1}$$


We will assume that $\omega_c T$ is an integer multiple of $2\pi$ to provide phase continuity at the
code bit boundaries. Note that $s_k(t)$ is a unit-energy waveform. The energy of the $k$th
user’s code bit measured at the receiver will be denoted by $E_k$. It will be assumed that all
$K$ users transmit their signals through a common additive white Gaussian noise channel
with two-sided noise spectral density $N_0/2$, and so the received signal will have the following
form

$$r(t) = \sum_{n=-\infty}^{\infty} \sum_{q=1}^{Q} \sum_{k=1}^{K} D_k^{(q)}(n) \sqrt{E_k}\, s_k(t - nT_s - (q-1)T - \tau_k) + z(t) \tag{2.2}$$

where $z(t)$ denotes the noise. If there is no coding on the link, $D_k(n) = I_k(n)$ and
$T_s = T_b = T$, so (2.2) may be rewritten in a simpler form.

$$r(t) = \sum_{n=-\infty}^{\infty} \sum_{k=1}^{K} D_k(n) \sqrt{E_k}\, s_k(t - nT - \tau_k) + z(t). \tag{2.3}$$

Because  it  is  more  notationally  cumbersome  to  discuss  the  coded  link  case,  we  will  use 
the uncoded link case for the remainder of this chapter and the next to introduce the system
model and some of the receivers that have been proposed in the literature. As a
result, for the remainder of this chapter and Chapter 3, we will use equation (2.3) to
represent the received signal. In Chapters 4 and 5, where the coded link case will be discussed
in detail, we will resort to the use of equation (2.2) to represent the received signal.

Next we define the partial cross-correlation of the known signature sequences of
users $j$ and $k$ to be:

$$\rho_{jk}(l) = \int_{-\infty}^{\infty} s_j(t - \tau_j)\, s_k(t - lT - \tau_k)\, dt. \tag{2.4}$$

It is worth noting that $\rho_{jj}(0) = 1$ and $\rho_{jk}(l) = \rho_{kj}(-l)$.
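
The partial cross-correlations can be evaluated numerically under simplifying assumptions. The sketch below assumes baseband rectangular chips (the carrier is ignored), delays that are whole numbers of chips, and two hypothetical 7-chip signatures `s1` and `s2`. It takes the lag convention to be that user $k$'s waveform is delayed by $l$ symbols relative to user $j$'s, which is the convention under which the stated properties hold.

```python
def rho(sig_j, tau_j, sig_k, tau_k, lag):
    """Partial cross-correlation of two sampled, unit-energy signatures.

    sig_j, sig_k : chip sequences (+1/-1) of equal length n (one symbol).
    tau_j, tau_k : delays in chips, each assumed to lie in [0, n).
    lag          : symbol lag l in {-1, 0, 1}; user k's waveform is
                   delayed by lag * n chips in addition to tau_k.
    """
    n = len(sig_j)
    base, total = n, 4 * n      # one symbol of padding so a lag of -1 fits

    def place(sig, start):
        grid = [0.0] * total
        for i, chip in enumerate(sig):
            grid[base + start + i] = chip / n ** 0.5   # unit-energy scaling
        return grid

    a = place(sig_j, tau_j)
    b = place(sig_k, tau_k + lag * n)
    return sum(x * y for x, y in zip(a, b))


# Hypothetical signatures; user 1 arrives first (tau = 0), user 2 three chips later.
s1 = [1, -1, 1, 1, -1, -1, 1]
s2 = [1, 1, -1, 1, -1, 1, 1]
rho_11_0 = rho(s1, 0, s1, 0, 0)
rho_12_m1 = rho(s1, 0, s2, 3, -1)
rho_21_p1 = rho(s2, 3, s1, 0, 1)
rho_12_p1 = rho(s1, 0, s2, 3, 1)
```

With the users ordered by increasing delay, the lag-1 correlation from the earlier user to the later one vanishes, while the reverse one does not.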

We  will  assume  that  the  front  end  of  the  receiver  consists  of  a  bank  of  K  matched 
filters  or  correlators,  each  matched  to  one  of  the  transmitted  waveforms  in  the  system. 
(Note  that  Figures  1.2  and  1.3  illustrate  a  correlator  implementation  of  the  matched  filter 
bank.) It was shown in [1] that the complete set of matched filter outputs generates
sufficient  statistics  for  the  demodulation  of  each  user’s  data.  In  Chapter  4,  we  will  not 
make  this  assumption  about  the  front  end,  but  will  ultimately  arrive  at  the  result  that  the 
optimal  sequence  estimator  may  be  implemented  with  the  matched  filter  bank  front  end. 
The output of the filter matched to the $k$th signal at time $(n+1)T + \tau_k$ is

$$r_k(n) = \int_{nT+\tau_k}^{(n+1)T+\tau_k} r(t)\, s_k(t - nT - \tau_k)\, dt \tag{2.5}$$

where perfect synchronization has been assumed here between the $k$th component of the
received signal and the local signature sequence generator at the receiver.




Substituting for $r(t)$ from equation (2.3) and integrating, we obtain

$$r_k(n) = \sum_{j=k+1}^{K} \rho_{kj}(-1) \sqrt{E_j}\, D_j(n-1) + \sum_{j=1}^{K} \rho_{kj}(0) \sqrt{E_j}\, D_j(n) + \sum_{j=1}^{k-1} \rho_{kj}(1) \sqrt{E_j}\, D_j(n+1) + z_k(n) \tag{2.6}$$

Note that because the system is asynchronous, these matched filter outputs become available
at different times for each user and interval. Figure 2.3 illustrates a time line for
each user in the system. Without loss of generality, we assume that the users are
ordered according to increasing $\tau_k$.

This  set  of  K  equations  for  the  K  received  signals  can  be  denoted  in  the  following 
compact  matrix  form: 

$$R(n) = \rho(-1) E D(n-1) + \rho(0) E D(n) + \rho(1) E D(n+1) + Z(n) \tag{2.7}$$

Here $\rho(m)$ is the signature correlation matrix for lag $m$, which is $K \times K$. $E$ is a $K \times K$
diagonal energy matrix with the $k$th diagonal term being $\sqrt{E_k}$, $D(n)$ is the $K \times 1$ data
vector holding each user’s independent and identically distributed data at time $n$, and
$Z(n)$ is a $K \times 1$ vector of noise variates which are Gaussian random variables colored
both spatially and temporally as shown below. The noise variables have zero mean and
covariance matrix

$$C(m) = \frac{N_0}{2} \rho(m) \tag{2.8}$$

for the $m$th time lag. These matrices are $K \times K$ at each time lag. $N_0/2$ is the two-sided
noise spectral density.
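
The vector model can be exercised with a small noise-free sketch. The correlation values, amplitudes and data below are hypothetical, chosen only so that $\rho(1)$ is lower triangular, $\rho(-1)$ is its transpose and $\rho(0)$ is symmetric with a unit diagonal. Following the substitution of (2.3) into (2.5), each user's data is scaled by its own amplitude before the correlation matrices are applied.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def noise_free_output(rho, amplitudes, d_prev, d_now, d_next):
    """Noise-free matched-filter vector for one symbol interval.

    rho        : dict mapping lag m in {-1, 0, 1} to a K x K correlation matrix.
    amplitudes : received amplitudes sqrt(E_k), the diagonal of the energy matrix.
    d_prev, d_now, d_next : data vectors D(n-1), D(n), D(n+1).
    """
    def scale(data):
        return [a * x for a, x in zip(amplitudes, data)]   # E D

    terms = [mat_vec(rho[-1], scale(d_prev)),
             mat_vec(rho[0], scale(d_now)),
             mat_vec(rho[1], scale(d_next))]
    return [sum(column) for column in zip(*terms)]


# Hypothetical 2-user example; user 2 is received 6 dB stronger than user 1.
rho = {-1: [[0.0, 0.1], [0.0, 0.0]],
        0: [[1.0, 0.2], [0.2, 1.0]],
        1: [[0.0, 0.0], [0.1, 0.0]]}
amplitudes = [1.0, 2.0]
r = noise_free_output(rho, amplitudes, [1, -1], [1, 1], [-1, 1])
```

User 1's output depends on user 2's previous and current bits, while user 2's output depends on user 1's current and next bits, which reflects the ordering of the users by delay.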

Figure 2.2 Discrete time vector model of the multiuser CDMA channel.

Figure 2.3 Time lines for each of the K users in an asynchronous CDMA system.

Figure 2.4 Discrete time model of a 2-user CDMA channel.

For the case where there are two users in the system these matrix equations take the
following form:

$$\begin{bmatrix} r_1(n) \\ r_2(n) \end{bmatrix} = \begin{bmatrix} 0 & \rho_{12}(-1) \\ 0 & 0 \end{bmatrix} E D(n-1) + \begin{bmatrix} 1 & \rho_{12}(0) \\ \rho_{12}(0) & 1 \end{bmatrix} E D(n) + \begin{bmatrix} 0 & 0 \\ \rho_{21}(1) & 0 \end{bmatrix} E D(n+1) + \begin{bmatrix} z_1(n) \\ z_2(n) \end{bmatrix} \tag{2.9}$$


and  the  covariance  matrix  will  be 


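$$C(m) = \frac{N_0}{2} \begin{bmatrix} \delta(m) & \rho_{12}(-1)\delta(m-1) + \rho_{12}(0)\delta(m) \\ \rho_{12}(0)\delta(m) + \rho_{21}(1)\delta(m+1) & \delta(m) \end{bmatrix} \tag{2.10}$$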


The matrix $\rho(1)$ will be lower triangular in general, and the matrix $\rho(-1)$ will be upper
triangular in general.

As we noted from definition (2.4), $\rho_{12}(-1) = \rho_{21}(1)$. This implies that in the general
$K$ user case, $\rho(-1) = \rho^T(1)$ and furthermore that $\rho(0)$ is a Hermitian matrix. Also, it is
important to note that $\rho(i) = 0$ when $|i| > 1$ due to the one symbol interval support of the
waveform and, in addition, that the noise sequence within a given channel,
$\{z_k(n)\}_{n=-\infty}^{\infty}$, is temporally white.

At this point, the continuous-time CDMA waveform channel has been cast into the
form of a discrete-time vector channel. The equivalent vector model is shown in Figure
2.2 in the z-domain for simplicity. In this figure, $H(z)$ is the channel transfer function
matrix, and $A(z)$ is the transfer function matrix of a filter which properly colors the noise.
The receiver’s job might be to observe the sequence of matched filter outputs and to
make an estimate of the entire data sequence $D$ in the uncoded case and $I$ in the coded
case, or it might be to perform some form of symbol-by-symbol detection. $R(n)$ consists
of a filtered version of $D(n)$ in noise which is both spatially and temporally colored
according to $C(m) = \rho(m) N_0/2$. Due to the necessary synchronization circuits at the
front end of the receiver, all of the signal cross-correlations can be generated at the
receiver using at most $K(K+1)/2$ correlators, since we would need to dedicate at most
one correlator to the computation of $\rho_{jk}(l)$ for each $j$ and $k$ pair. (If the channel transfer
function matrix is not changing quickly, it may be possible to generate the signal cross-correlations
using a much smaller number of correlators in a time-sharing fashion.) As
long as the receiver’s synchronization circuits are locked to the appropriate component of
the received signal, the normalized transfer function of the channel, $\rho(z)$, will be measurable.
In addition, if we assume that the receiver has gain estimation circuits built into the
synchronization stage of the receiver, the energy matrix, $E$, will also be known. Finally,
the correlation matrix of the noise can be constructed easily from the locally generated
signal cross-correlations. Throughout the rest of this work, we will make the assumption
that the energies and cross-correlations of all of the users have been estimated perfectly,
i.e., $H$ is known.

2.1  Possible  Receiver  Structures  for  the  Uncoded  Case 

The receiver’s goal is to minimize the probability of error for the channel of
interest. The traditional receiver for the $k$-th user will simply make a zero-threshold
comparison on the observed statistic, $r_k(n)$, at each symbol interval, $n$. This detection
strategy is optimal if the only available statistic is $r_k(n)$ and we have no knowledge of the
structure of the interference, or if the signature sequences are mutually orthogonal, which
is generally not the situation in the asynchronous case. If all of the other received
statistics are available as well, as in an augmented receiver or a basestation in a cellular network,
then this knowledge may be used to perform a joint estimation of the sequences (see
Figure 1.3). This kind of augmented receiver is referred to in the literature as a multiuser
receiver.
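
To make the zero-threshold rule concrete, the following is a minimal Python sketch that is not taken from the original work. It assumes a noise-free two-user asynchronous channel in which each matched filter output is the desired term plus cross-correlation terms from the next, current and previous bit intervals (pairing $\rho(1)$ with bit $n+1$ and $\rho(-1)$ with bit $n-1$). The function names, the list-of-lists layout and all numerical values are illustrative assumptions.

```python
def matched_filter_outputs(bits, amp, rho_m1, rho_0, rho_p1):
    """Noise-free matched filter outputs r[n][j] (illustrative model only).

    bits[n][l] is the bit of user l in interval n, in {-1, +1};
    amp[l] plays the role of sqrt(E_l).
    """
    n_bits, k = len(bits), len(amp)

    def d(n, l):
        # Bits outside the transmitted block are treated as absent (zero).
        return bits[n][l] if 0 <= n < n_bits else 0

    r = []
    for n in range(n_bits):
        row = []
        for j in range(k):
            total = 0.0
            for l in range(k):
                total += rho_p1[j][l] * d(n + 1, l) * amp[l]
                total += rho_0[j][l] * d(n, l) * amp[l]
                total += rho_m1[j][l] * d(n - 1, l) * amp[l]
            row.append(total)
        r.append(row)
    return r


def conventional_detector(r):
    """Zero-threshold comparison on every matched filter output."""
    return [[1 if x >= 0 else -1 for x in row] for row in r]


# Hypothetical two-user cross-correlations (user 0 leads user 1 in time).
rho_0 = [[1.0, 0.2], [0.2, 1.0]]
rho_p1 = [[0.0, 0.0], [0.3, 0.0]]   # lower triangular
rho_m1 = [[0.0, 0.3], [0.0, 0.0]]   # upper triangular (transpose of rho_p1)
bits = [[1, -1], [1, -1], [-1, 1], [1, 1]]

equal_power = conventional_detector(
    matched_filter_outputs(bits, [1.0, 1.0], rho_m1, rho_0, rho_p1))
near_far = conventional_detector(
    matched_filter_outputs(bits, [1.0, 3.0], rho_m1, rho_0, rho_p1))
print(equal_power == bits, near_far == bits)
```

With equal amplitudes the interference never exceeds the desired term in this example, so every bit is recovered. When the second user is three times stronger, the weaker user's second bit is decided incorrectly even without noise, which is the situation a multiuser receiver is meant to address.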

There are many multiuser receivers that have been proposed in the past decade (see
[1] - [44]). These approaches can be broadly grouped together in three main categories:
trellis and tree-based approaches, linear equalizer approaches and decision feedback
approaches. Many multiuser receivers have been proposed in each
of these categories. Furthermore, some of these receivers do not fit cleanly into one of
these categories, but instead are hybrids of two of the approaches. Rather than discuss
each of the previously proposed receivers individually, we will discuss the three
categories. In our discussion of each category, we will cite some of the key
representatives of that approach.

2.1.1  Trellis/Tree-Based  Approaches 

The broad category of approaches which we refer to as trellis and tree-based includes
all decoders which operate by searching either a trellis or a tree for the most likely
sequence that was transmitted. As we will see, this search will result in the maximum
likelihood sequence estimate of the transmitted data if the optimal metric is used and the
search is performed with the Viterbi algorithm in a trellis with $2^{K-1}$ states for a $K$-user
system.

Due to the finite alphabet of the input data symbols to the known matrix FIR channel
filter $H(n)$, the multiuser detection problem may be considered to be one of estimating
the inputs to a finite-state machine in colored Gaussian noise. The minimum probability
of sequence error decoder for all of the users would then correspond to the maximum
likelihood decoder if all of the input sequences were assumed to be equally likely.
It was shown in [1] that the maximum likelihood sequence estimator (MLSE) can take
the form of a Viterbi algorithm which traverses $K$ stages of a $2^{K-1}$-state cyclically
time-varying trellis in one bit period of the individual users in the system. The optimal metric
at stage $n$ of the trellis takes the form:

$$
\Lambda_n(D_n, \sigma_{n-1}) = \Lambda_{n-1}(D_{n-1}, \sigma_{n-2}) + \lambda_n(D_n, \sigma_{n-1})
\tag{2.11}
$$

where $D_n$ represents all of the data bits for all of the users up to time interval $n$, $\sigma_{n-1}$
represents the state at the end of stage $n-1$, and

$$
\lambda_n(D_n, \sigma_{n-1}) = D_{\beta(n)}(\alpha(n)) \Big[ 2 r_{\beta(n)}(\alpha(n))
- D_{\beta(n)}(\alpha(n)) \sqrt{E_{\beta(n)}}\, \rho_{\beta(n)\beta(n)}(0)
- 2 \sum_{j<n} D_{\beta(j)}(\alpha(j)) \sqrt{E_{\beta(j)}}\, \rho_{\beta(n)\beta(j)}(\alpha(j)-\alpha(n)) \Big]
\tag{2.12}
$$

where $\beta(n)$ and $\alpha(n)$ are the resulting components of a modulo-$K$ decomposition of the
integer $n$, i.e. $n = \alpha(n) K + \beta(n)$ with $\beta(n) < K$.

It is significant that the crosscorrelation $\rho_{\beta(n)\beta(j)}(\alpha(j)-\alpha(n))$ in the last term of
equation (2.12) is zero for $|n-j| \ge K$. This implies that the limits of the sum can be reduced
to the values over which $\rho_{\beta(n)\beta(j)}(\alpha(j)-\alpha(n))$ is nonzero, namely from $j = n-(K-1)$ to
$j = n-1$. The number of terms in the sum of the final term and, correspondingly, the
number of past data bits that affect the stage metric is $K-1$. Because the state is defined
as the past data bits affecting the stage metric, we see that the state of the system is given
by

$$
\sigma_{n-1} = [\, D_{\beta(n-1)}(\alpha(n-1)),\; D_{\beta(n-2)}(\alpha(n-2)),\; \ldots,\; D_{\beta(n-K+1)}(\alpha(n-K+1)) \,]
\tag{2.13}
$$

and since each data bit is binary, we get the result that there are $2^{K-1}$ states that are
possible for (2.13).

Because the metric must be computed for both of the paths that emanate from each
of the $2^{K-1}$ states at each trellis stage, and because with each stage one bit is estimated,
we may say that the time-complexity per estimated bit is TCB = $O(2 \times 2^{K-1}/1)$ =
$O(2^K)$ [1]. It is interesting that an earlier formulation of this problem used the
philosophy that the system should be modeled as a vector state machine [25]. This approach
implies that the state is determined by the $K$ bits from the previous interval, $n-1$, and the
current input is the vector of the $K$ bits, one for each user, in interval $n$. This approach
had a time-complexity per estimated bit of TCB = $O([2^K\, 2^K]/K) = O(4^K/K)$. The
approach in [1], which does not view the problem as a vector problem, is clearly less
complex.
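
The gap between the two formulations grows quickly with the number of users. The short Python sketch below is only an illustration (the function names are our own); it evaluates the two operation counts, $2 \times 2^{K-1}$ metrics per bit for the scalar trellis and $2^K \cdot 2^K / K$ metrics per bit for the vector trellis, ignoring constant factors.

```python
def tcb_scalar_trellis(k):
    # Two branches leave each of the 2**(k-1) states; one bit is decided per stage.
    return 2 * 2 ** (k - 1)


def tcb_vector_trellis(k):
    # 2**k states with 2**k branches each; k bits are decided per stage.
    return (2 ** k) * (2 ** k) / k


for k in (2, 4, 8, 16):
    print(k, tcb_scalar_trellis(k), tcb_vector_trellis(k))
```

For $K = 4$ the counts are 16 and 64 metric computations per bit, and the ratio between them, $2^K/K$, keeps growing as users are added.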

The obvious drawback of the Viterbi algorithm in this application is that the
complexity of the state machine is exponential in $K$, the number of users in the system, and
the receiver complexity will be excessive for a large system. It is therefore interesting to
consider suboptimum solutions which are less complex than the optimal sequence
estimator.

If the search is somehow performed on a trellis with fewer states and if a metric
other than the optimal metric is used, the performance will be worse than that of the
optimal MLSE by some amount. The goal of reduced state sequence estimation (RSSE)
is to operate on a trellis with fewer states than the optimal MLSE, while still performing
nearly as well as the optimal decoder. The key advantage of RSSE is that if the number
of states is reduced significantly from $2^{K-1}$, then the complexity savings for the decoder
will be large. RSSE was introduced in [47] and [57] - [60] for the intersymbol
interference (ISI) problem, and has also been studied for use with trellis codes and convolutional
codes [47]. This idea has recently been applied to the simplification of the optimal
sequence estimator of [1] in [8] and [56]. The complexity of the RSSE decoder will
depend on how much the optimal trellis of $2^{K-1}$ states has been reduced. The
performance is also a function of the degree of reduction that is performed.

It  is  also  possible  to  view  the  problem  as  a  search  in  a  tree.  Sequential  decoding 
approaches  are  sparse  tree  search  procedures  which  are  useful  for  applications  like 
decoding  convolutional  codes  with  very  large  constraint  lengths  and  equalizing  ISI  when 
the  channel’s  impulse  response  is  long.  As  with  RSSE,  sequential  decoding  approaches 
provide  a  tradeoff  of  complexity  versus  performance.  Naturally,  these  approaches  can  be 
an attractive way to reduce the optimal sequence estimator’s high complexity. In [14], a
modified Fano metric was used with the stack algorithm to simplify the decoding
operation and still maintain a performance level near that of the MLSE.

2.1.2  Linear  Approaches 

The idea in these approaches is that decision statistics will be formed from linear
combinations of the matched filter outputs, or more generally from linear operations on
the received signal. Focusing, for now, on the first approach, we again consider the
discrete-time vector model of the channel from the input signals to the matched filter
outputs as is illustrated in Figure 2.5. The Z-transform of the sequence of vectors of matched
filter outputs is denoted by $R(z)$. This sequence of received vectors is passed through a
linear filter with transfer function matrix $B(z)$. The resulting decision statistic vectors
are then compared with a threshold of zero, term by term, to produce the decision vector,
$\hat{D}(n)$, in the $n$-th time interval. The Z-transform of the sequence of decision vectors is
denoted in Figure 2.5 by $\hat{D}(z)$.

Figure 2.5 Discrete time vector model of the multiuser CDMA channel with a linear receiver.

Figure 2.6 Convolutionally coded direct sequence CDMA link
with a conventional basestation architecture (case $\beta_j = 1$, $j \in \{1,\ldots,K\}$ shown).

The obvious question which arises with this approach is what values should be used
in the $B$ matrix to minimize the error probability of the decisions. This problem,
unfortunately, is not analytically tractable, and so alternate performance criteria must be
adopted. The two most common criteria for solving this problem for linear equalizers for
single-user systems which suffer from intersymbol interference (ISI) are the minimum
mean squared error (MMSE) criterion and the zero-forcing criterion. Both of these
criteria have been applied to the multiuser detection problem.

In [4] and also essentially in [17], the decorrelator receiver was proposed. This
receiver is the multiuser analog of the zero-forcing linear equalizer for the ISI problem.
The solution for the decorrelator is that $B(z) = \rho^{-1}(z)$, or in words, for the decorrelator
the linear filter is chosen to invert the normalized channel transfer function matrix.

Because this receiver will form a portion of a multiuser receiver which will be
proposed in Chapter 5, we will review the basic mathematics behind the decorrelator. Since
the Z-transform of the sequence of received vectors is given by

$$
R(z) = \rho(z) E(z) D(z) + Z(z)
\tag{2.14}
$$

it follows that the decorrelator’s decision statistic vector sequence will have Z-transform

$$
B(z) R(z) = \rho^{-1}(z)\rho(z)E(z)D(z) + \rho^{-1}(z)Z(z) = E(z)D(z) + \rho^{-1}(z)Z(z)
\tag{2.15}
$$

Let the decorrelator’s output noise vector sequence have Z-transform

$$
\tilde{Z}(z) = \rho^{-1}(z) Z(z)
\tag{2.16}
$$

Equation (2.15) shows that the decorrelator has eliminated the multiuser interference
caused by $\rho(z)$; however, in general, the variance of $\tilde{z}_k(n)$ will be larger than that of
$z_k(n)$. ($\tilde{z}_k(n)$ represents the noise on the $k$-th user’s decision statistic.) Let the
covariance matrix of this noise be denoted by

$$
\Phi_{jk}(l) = E[\tilde{z}_j(n)\,\tilde{z}_k(n-l)]
\tag{2.17}
$$

The Z-transform of the covariance matrix sequence may be written as

$$
ZT[\Phi(n)] = E[\tilde{Z}(z)\tilde{Z}^{H}(z)] = E[\rho^{-1}(z) Z(z) Z^{H}(z) \rho^{-H}(z)] = \rho^{-1}(z) N_0/2
\tag{2.18}
$$

since $E[Z(z)Z^{H}(z)] = \rho(z) N_0/2$. This is obtained from taking the Z-transform of
equation (2.8). Finally, it is not difficult to show that $E[\tilde{z}_k(n)] = 0$, and
Var$[\tilde{z}_k(n)] = \phi_{kk}(0) N_0/2$, where $\phi_{kk}(0)$ is the $k$-th diagonal element of the zero-lag
coefficient of $\rho^{-1}(z)$. Because $\phi_{kk}(0)$ will be greater than one, the noise variance is
enhanced by some amount. Thus, the decorrelator linearly combines the matched filter
outputs in such a way as to eliminate the MUI, but in doing so, the noise variance is
increased. This is the same effect which is observed for the zero-forcing linear equalizer
for the ISI problem.
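
The size of the noise enhancement is easy to see in a special case. The following worked example is an illustration and is not part of the original development: it assumes only two users that are symbol-synchronous, so that the normalized correlation matrix reduces to a constant matrix with a single real cross-correlation $r$, $|r| < 1$.

$$
\rho = \begin{bmatrix} 1 & r \\ r & 1 \end{bmatrix},
\qquad
\rho^{-1} = \frac{1}{1-r^{2}} \begin{bmatrix} 1 & -r \\ -r & 1 \end{bmatrix},
\qquad
\mathrm{Var}[\tilde{z}_k(n)] = \frac{1}{1-r^{2}} \cdot \frac{N_0}{2} \;\ge\; \frac{N_0}{2}
$$

Under these assumptions the diagonal element of $\rho^{-1}$ is $1/(1-r^2)$, which equals one only when the users are orthogonal. For $r = 0.5$ the noise variance grows by a factor of $4/3$, or about 1.25 dB, and the penalty increases without bound as $|r|$ approaches one.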

The MMSE linear receiver was discussed in [13], [21] and [24]. The transfer
function of the MMSE receiver turns out to be

$$
B(z) = \left(\rho(z) + I N_0/2\right)^{-1}
\tag{2.19}
$$

From this equation, it is easy to see that this receiver reduces to the decorrelator when the
background noise in the multiuser system goes to zero, $N_0/2 \to 0$. In the case where there
is background noise, the MMSE receiver can outperform the decorrelator because it
minimizes the variance of both the MUI and the noise.
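
The limiting behaviour can be checked numerically. The Python sketch below is a hedged illustration, not the author's implementation: it reads (2.19) as $B = (\rho + (N_0/2) I)^{-1}$ and assumes a two-user synchronous system so that $\rho$ is a constant 2x2 matrix. The helper names and the value of the cross-correlation are our own choices.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mmse_filter(rho, n0):
    """(rho + (n0/2) I)^-1 for a constant 2x2 correlation matrix."""
    return inv2([[rho[0][0] + n0 / 2, rho[0][1]],
                 [rho[1][0], rho[1][1] + n0 / 2]])


rho = [[1.0, 0.5], [0.5, 1.0]]      # hypothetical cross-correlation of 0.5
decorrelator = inv2(rho)

for n0 in (1.0, 0.1, 0.001, 0.0):
    print(n0, mmse_filter(rho, n0)[0])
```

As `n0` shrinks, the printed first row approaches that of the decorrelator, `[4/3, -2/3]`. For larger `n0` the filter gains are smaller, which is how the MMSE receiver limits the noise enhancement at the cost of leaving some residual MUI.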

Other  linear  receivers  have  also  been  introduced  in  the  literature,  see  for  example 
[4],  however  we  will  not  review  these  approaches  in  this  work.  The  interested  reader  is 
referred  to  the  references. 

One final thing to note about the linear approaches is that they generally have a
linear complexity in $K$. The size of the receiver’s matrix filter, $B(z)$, is $K \times K$, so $2K+1$
new FIR filters must be added to the matrix filter when user $K+1$ enters the system. The
ideal decorrelator and MMSE FIR matrix filters cannot be realized with a finite number
of taps; however, the filter impulse response would inevitably be truncated at some depth,
$\delta$. It therefore follows that the number of multiplications which are necessary for each
bit decoded is roughly $O(\delta K^2/K) = O(\delta K)$. The complexity associated with the tap
computation is a "one time" cost, as the taps only need to be computed once as long as
the channel conditions do not change. However, if new users enter and leave the system
often or if $\rho(z)$ changes with time, the filter taps have to be recomputed often and
this would make the complexity of these approaches much higher. Furthermore, taking
the inverse of a polynomial matrix is not an easy task.

2.1.3  Decision  Feedback  Approaches 

The  decision  feedback  approach  to  the  equalization  problem  is  not  linear,  but 
instead  involves  feeding  back  tentative  decisions  in  some  fashion  to  attempt  to  cancel  the 
multiuser  interference.  DFE  structures  were  studied  in  [6],  [7],  [12],  [13],  [20],  [22]  - 
[24].  In  the  literature,  some  of  these  approaches  are  called  multistage  receivers,  some  are 
called  successive  cancellation  receivers  and  some  are  simply  called  DFE’s.  Chapter  3 
will  present  a  number  of  the  various  approaches  in  a  unified  way  to  illustrate  their  com¬ 
monality  and  we  will  use  the  term  DFE  to  refer  to  this  class.  The  decision  feedback 
approach will be discussed in detail in that chapter and a hybrid DFE will be developed
which outperforms the nonlinear DFE’s proposed to this point. These DFE approaches
will also turn out to have a linear complexity with $K$.

2.2  Possible  Receiver  Structures  for  the  Coded  Case 

After we discuss DFE’s in Chapter 3, we will move on to consider multiuser
detection for coded links. Consider for a moment CDMA links where each user employs a
convolutional code. The basestation that will be referred to as the conventional
basestation on this coded link attempts to estimate the $k$-th user’s data using only the
matched filter outputs for the $k$-th user (see Figure 2.6). It will be assumed that the
Viterbi algorithm operating on each user’s observed code symbols is a soft-decision
Viterbi algorithm having a decoding delay which generally will be several times the
constraint length, $W$, of the code. The time complexity per decoded bit for this receiver may
be estimated by considering the number of metric computations per information bit
decided. If we define the binary memory order of the encoder to be $\kappa = \log_2 S$, where $S$ is
the number of states of each user’s encoder, then there are $2^{\kappa+\beta}$ metrics computed for
every $\beta$ bits decided, so TCB = $O(2^{\kappa+\beta}/\beta)$.

As in the uncoded case, there are a number of ways that a multiuser receiver can
operate to improve upon the performance of the conventional basestation. In Chapter
4, the optimum sequence estimator will be derived for this problem and analyzed.
Because this receiver has a very high complexity, in Chapter 5 a broad range of
suboptimum receivers will be introduced which have a lower complexity than the optimal
sequence estimator and still maintain high performance levels in many cases. The only
works that have appeared in the literature on this topic at this time are [44], [53], [41] and
[43]. These approaches will all be unified in Chapter 5 and their performance will be
analyzed whenever possible.

Before moving on to the coded case, however, we will consider DFE’s for the
uncoded case in the next chapter. The DFE’s examined in Chapter 3 will
form the foundation for a number of the suboptimum low-complexity receivers of
Chapter 5.

Chapter 3  A New Nonlinear Decision Feedback Equalizer


To begin our discussion of the possible DFE structures, we will focus on two
nonlinear multistage-style DFE algorithms that have already been proposed. By comparing
them, we will be led to the hybrid structure which is the real focus of this chapter. We
will see that the new hybrid structure greatly outperforms the previous DFE’s in most
situations.

The multistage algorithm proposed by Varanasi and Aazhang in [6] can be viewed
as a form of a DFE that is appropriate for the multiuser problem, although it has a
different structure from the DFE’s used in the ISI problem. In [6], it was assumed that the
matched filter outputs were delayed appropriately so that they all became available to the
multistage algorithm at the same time. In this discussion, we will present the algorithm
in an equivalent sequential fashion because it will lead naturally to the DFE proposed in
[13] and the hybrid DFE to be proposed in Section 3.1. For the purposes of this
discussion, this algorithm will be referred to as Varanasi’s DFE or as Varanasi’s algorithm.

In Varanasi’s multistage algorithm, a hard decision is made on each matched filter
output and the output is stored in a buffer as is illustrated in Figure 3.1. This buffer has
an output which is available to the multistage algorithms operating on the other
sequences of matched filter outputs. This hard decision and storage forms what is
referred to as the first-stage of the algorithm. The second-stage uses a delayed matched
filter output and the tentative first-stage estimates of each user’s bits to obtain a better
second-stage estimate of the appropriate user’s bit. Because the actual MUI for user $j$ at
time interval $n$ is given by

$$
MUI_j(n) = \sum_{l=1}^{j-1} \rho_{jl}(1)\, D_l(n+1) \sqrt{E_l}
+ \sum_{\substack{l=1 \\ l \neq j}}^{K} \rho_{jl}(0)\, D_l(n) \sqrt{E_l}
+ \sum_{l=j+1}^{K} \rho_{jl}(-1)\, D_l(n-1) \sqrt{E_l}
\tag{3.1}
$$

the idea is to use first-stage estimates of each of the $D_l(k)$ values in (3.1) to estimate this
MUI. The values of $\rho_{jl}(k)$ are again obtained by cross-correlating the local signal
generators at the front end of the receiver which are synchronized to the actual signal of
interest, and the values of $\sqrt{E_l}$ are obtained by estimating the signal strength of each
user.

Figure 3.1 A 3-stage version of the multistage algorithm proposed by Varanasi and Aazhang in [6].
Iterative estimation of user j’s bit sequence shown.

In the second-stage, the estimated MUI is subtracted from the delayed matched filter
output and the resulting real number is compared with a zero threshold. The result is
placed into the appropriate position in the second-stage buffer and again this buffer
position is accessible for each of the other multistage algorithms which operate on the other
users’ matched filter output sequences. This procedure can be performed as many times
as is desired, but there is a number of stages beyond which very little additional gain is
achieved by adding stages [12].
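
The stage structure just described can be sketched in a few lines of Python. This is an illustrative sketch under our own assumptions rather than the implementation of [6]: it processes a whole block at once, uses a two-user noise-free example, pairs $\rho(1)$ with bit $n+1$ and $\rho(-1)$ with bit $n-1$, and all names and numerical values are hypothetical.

```python
def sgn(x):
    return 1 if x >= 0 else -1


def multistage_detect(r, amp, rho_m1, rho_0, rho_p1, stages=2):
    """Every stage after the first rebuilds the MUI from the previous
    stage's decisions only, subtracts it, and re-thresholds."""
    n_bits, k = len(r), len(amp)
    est = [[sgn(x) for x in row] for row in r]   # stage 1: conventional detector

    def bit(prev, n, l):
        # Bits outside the block are treated as absent (zero).
        return prev[n][l] if 0 <= n < n_bits else 0

    for _ in range(stages - 1):
        new = []
        for n in range(n_bits):
            row = []
            for j in range(k):
                mui = 0.0
                for l in range(k):
                    if l == j:
                        continue
                    mui += rho_p1[j][l] * bit(est, n + 1, l) * amp[l]
                    mui += rho_0[j][l] * bit(est, n, l) * amp[l]
                    mui += rho_m1[j][l] * bit(est, n - 1, l) * amp[l]
                row.append(sgn(r[n][j] - mui))
            new.append(row)
        est = new
    return est


# Hypothetical two-user example; user 1 is three times stronger than user 0.
rho_0 = [[1.0, 0.2], [0.2, 1.0]]
rho_p1 = [[0.0, 0.0], [0.3, 0.0]]
rho_m1 = [[0.0, 0.3], [0.0, 0.0]]
amp = [1.0, 3.0]
bits = [[1, -1], [1, -1], [-1, 1], [1, 1]]
# Noise-free matched filter outputs that this model produces for `bits`.
r = [[0.4, -2.5], [-0.5, -3.1], [-1.3, 3.1], [2.5, 3.2]]

stage1 = multistage_detect(r, amp, rho_m1, rho_0, rho_p1, stages=1)
stage2 = multistage_detect(r, amp, rho_m1, rho_0, rho_p1, stages=2)
print(stage1 == bits, stage2 == bits)
```

In this example the first stage gets one bit of the weaker user wrong, while the second stage recovers the whole block because the strong user's first-stage decisions are reliable enough to cancel its interference.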

A  second  DFE  structure  was  proposed  by  Xie,  Short  and  Rushforth  in  [13],  and  for 
the  purposes  of  this  discussion,  this  structure  will  be  referred  to  as  Xie’s  DFE  or  Xie’s 
algorithm.  Xie’s  DFE  was  not  proposed  as  a  multistage  algorithm,  but  instead  as  a  vari¬ 
ant  of  the  more  traditional  ISI-style  DFE  structure.  This  algorithm  can  also  be  viewed  as 
a  two-stage  multistage  algorithm,  however.  It  is  in  this  spirit  that  it  is  presented  in  Figure 
3.2  so  that  it  can  be  easily  compared  with  the  multistage  algorithm  of  Figure  3.1. 

The first-stage of this DFE is again a conventional detector which simply makes
hard decisions on the matched filter outputs and stores the estimated bits in a buffer.
This section of the DFE was viewed by Xie et al. as a nonlinear feedforward portion and
the second-stage was viewed as a nonlinear feedback portion. This view connects it with
the ISI equalizers, which have linear feedforward portions followed by nonlinear
feedback portions.

The MUI which corrupts each matched filter output is described by equation (3.1)
and may be broken up into two parts. One part consists of bits which trail the bit being
decided in time and have already been estimated at the second-stage level. This portion is
called the past portion of the MUI, and for user $j$ at time interval $n-1$ it can be written as

$$
PMUI_j(n-1) = \sum_{l=1}^{j-1} \rho_{jl}(0)\, D_l(n-1) \sqrt{E_l}
+ \sum_{l=j+1}^{K} \rho_{jl}(-1)\, D_l(n-2) \sqrt{E_l}
\tag{3.2}
$$

The second portion of the MUI consists of bits which have not yet been estimated by the
second-stage and so first-stage estimates of these bits must be used. The future portion
of the MUI for user $j$ at time interval $n-1$ can be expressed as

$$
FMUI_j(n-1) = \sum_{l=1}^{j-1} \rho_{jl}(1)\, D_l(n) \sqrt{E_l}
+ \sum_{l=j+1}^{K} \rho_{jl}(0)\, D_l(n-1) \sqrt{E_l}
\tag{3.3}
$$

which is the portion of (3.1) which is not included in (3.2).

Figure 3.2 The decision feedback equalizer proposed by Xie, Short and Rushforth in [13] shown in the
form of a 2-stage multistage algorithm. Estimation of user j’s bit sequence is shown.

It should be apparent at this point that the process carried out by this DFE can be
extended easily to more stages by using the second-stage estimates of the data as bits in
the future portion of a third-stage MUI estimate. This may in some cases provide better
data estimates than are available from a two-stage equalizer.

A major difference between the two multistage algorithms is that Varanasi’s DFE
uses exclusively first-stage data estimates in its second-stage MUI estimate, while Xie’s
DFE uses as many of the more reliable second-stage data estimates as are available at the
decision time for the $j$-th user’s data. It is thus expected that Xie’s DFE will typically
outperform Varanasi’s two-stage multistage algorithm.

A major drawback of both algorithms, however, is that they both rely upon a
conventional detector as their first-stage. It will be shown in the next section that in a severe
MUI environment, where either the cross-correlation between adjacent users is very high
or there are enough moderately cross-correlated users to produce the situation where the
MUI term is larger than the desired signal term, this conventional first-stage badly limits
the performance of these equalizers. It is for this reason that we propose a hybrid
equalizer which does not use a conventional first-stage. This hybrid decoder will be shown to
perform better in the severe MUI environment than the other DFE’s examined so far
because it will use a better first-stage. This decoder and a variety of simulation results
comparing all of these DFE structures are the subjects of the next two sections.


3.1  The  Hybrid  DFE 

Figure 3.3 illustrates two stages of an alternative multistage DFE. The first-stage is
not a conventional detector which simply makes hard decisions on the matched filter
outputs, but is instead a nonlinear decision feedback equalizer itself. Due to the need for
causality of the equalizer, however, not all of the MUI can be estimated at the first-stage,
and so only the past portion of the MUI can be reconstructed. In this way, the first-stage
makes an attempt to estimate the past half of the MUI and subtracts the
influence of those data bits from the appropriate matched filter output. Thus the
first-stage data estimate for the $j$-th user’s data bit will be the following:

$$
\hat{D}_j^{(1)}(n) = \mathrm{sgn}\big[\, r_j(n) - \widehat{PMUI}_j^{(1)}(n) \,\big]
\tag{3.4}
$$

where sgn[·] is the signum function which performs the zero-threshold comparison and

$$
\widehat{PMUI}_j^{(1)}(n) = \sum_{l=1}^{j-1} \rho_{jl}(0)\, \hat{D}_l^{(1)}(n) \sqrt{E_l}
+ \sum_{l=j+1}^{K} \rho_{jl}(-1)\, \hat{D}_l^{(1)}(n-1) \sqrt{E_l}
\tag{3.5}
$$

The subsequent stages operate in the same way as the second-stage of Xie’s DFE. Here,
because of the delay imposed, the $(n-1)$-th data bit will be estimated at time $n$ by:

$$
\hat{D}_j^{(2)}(n-1) = \mathrm{sgn}\big[\, r_j(n-1) - \widehat{PMUI}_j^{(2)}(n-1) - \widehat{FMUI}_j^{(2)}(n-1) \,\big]
\tag{3.6}
$$

where

$$
\widehat{PMUI}_j^{(2)}(n-1) = \sum_{l=1}^{j-1} \rho_{jl}(0)\, \hat{D}_l^{(2)}(n-1) \sqrt{E_l}
+ \sum_{l=j+1}^{K} \rho_{jl}(-1)\, \hat{D}_l^{(2)}(n-2) \sqrt{E_l}
\tag{3.7}
$$

$$
\widehat{FMUI}_j^{(2)}(n-1) = \sum_{l=1}^{j-1} \rho_{jl}(1)\, \hat{D}_l^{(1)}(n) \sqrt{E_l}
+ \sum_{l=j+1}^{K} \rho_{jl}(0)\, \hat{D}_l^{(1)}(n-1) \sqrt{E_l}
\tag{3.8}
$$

Note that if the feedback is correct and the energies and crosscorrelations are perfectly
estimated, the MUI will be perfectly cancelled. If the feedback is incorrect for a
particular bit, then that bit’s contribution to the interference will effectively be doubled.

Figure 3.3 A hybrid multistage algorithm with 2 stages shown for estimating user j’s bit sequence.

It is assumed in the algorithm above that the delay between each of the user’s data
bit intervals (i.e., for each $j$) is large enough to allow the computation of
equations (3.4) and (3.6) for the $(j-1)$-th user so that the results can be stored in buffers before
they are accessed by the $j$-th user’s multistage algorithm. To ensure that the proper data
bit is used in the computation of the MUI, a status bit could be used to indicate whether
the data residing in the latch holding the data bit of interest is the desired data bit or that
from one symbol period earlier, which would be the case if the computation has not yet
been completed for user $j-1$. The $j$-th user’s multistage algorithm should then wait until
the status bit for the $(j-1)$-th user’s data shows that the data is ready before computing the
MUI estimate for the $j$-th user at that stage. In this way, assuming that the computation of
each user’s data bit at each stage takes less than $T/K$ seconds, where $T$ is the bit period
for each user and $K$ is the number of users, the DFE will be able to operate in real time.

This hybrid algorithm combines the best parts of each DFE, namely Varanasi’s
multistage flexibility, and Xie’s breaking up of the MUI into future and past portions, which
allows the past half of the MUI to be estimated by the more reliable same-stage data
estimates at each stage. The distinguishing feature of the hybrid multistage algorithm is the
first-stage which attempts to cancel half of the MUI terms as opposed to the conventional
first-stages of both of the other equalizers.
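
A compact way to see how the two stages of the hybrid DFE fit together is the Python sketch below. It is an illustrative sketch under our own assumptions and not the implementation used for the simulations: it processes a stored block rather than running in real time, uses a two-user noise-free example, pairs $\rho(1)$ with bit $n+1$ and $\rho(-1)$ with bit $n-1$, and all names and values are hypothetical.

```python
def sgn(x):
    return 1 if x >= 0 else -1


def hybrid_dfe(r, amp, rho_m1, rho_0, rho_p1):
    """Two-stage hybrid DFE on a stored block; returns (stage1, stage2)."""
    n_bits, k = len(r), len(amp)

    def bit(est, n, l):
        # Bits outside the block are treated as absent (zero).
        return est[n][l] if 0 <= n < n_bits else 0

    # Stage 1: cancel only the past portion, using stage-1 decisions.
    s1 = [[0] * k for _ in range(n_bits)]
    for n in range(n_bits):
        for j in range(k):
            past = sum(rho_0[j][l] * s1[n][l] * amp[l] for l in range(j))
            past += sum(rho_m1[j][l] * bit(s1, n - 1, l) * amp[l]
                        for l in range(j + 1, k))
            s1[n][j] = sgn(r[n][j] - past)

    # Stage 2: past portion from stage-2 decisions, future portion from stage 1.
    s2 = [[0] * k for _ in range(n_bits)]
    for n in range(n_bits):
        for j in range(k):
            past = sum(rho_0[j][l] * s2[n][l] * amp[l] for l in range(j))
            past += sum(rho_m1[j][l] * bit(s2, n - 1, l) * amp[l]
                        for l in range(j + 1, k))
            future = sum(rho_p1[j][l] * bit(s1, n + 1, l) * amp[l]
                         for l in range(j))
            future += sum(rho_0[j][l] * s1[n][l] * amp[l]
                          for l in range(j + 1, k))
            s2[n][j] = sgn(r[n][j] - past - future)
    return s1, s2


# Hypothetical two-user example; user 1 is three times stronger than user 0.
rho_0 = [[1.0, 0.2], [0.2, 1.0]]
rho_p1 = [[0.0, 0.0], [0.3, 0.0]]
rho_m1 = [[0.0, 0.3], [0.0, 0.0]]
amp = [1.0, 3.0]
bits = [[1, -1], [1, -1], [-1, 1], [1, 1]]
# Noise-free matched filter outputs that this model produces for `bits`.
r = [[0.4, -2.5], [-0.5, -3.1], [-1.3, 3.1], [2.5, 3.2]]

conventional = [[sgn(x) for x in row] for row in r]
s1, s2 = hybrid_dfe(r, amp, rho_m1, rho_0, rho_p1)
print(conventional == bits, s1 == bits, s2 == bits)
```

In this example a conventional first-stage makes an error on the weaker user, whereas the decision feedback first-stage already recovers every bit, so the second-stage starts from better tentative decisions.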

3.2  Simulation  Results 

In this work, the various receivers were simulated for a four-user system. The
values of the cross-correlations were exaggerated to simulate a highly bandwidth-efficient
CDMA network. Figure 3.4 shows two sets of signature sequences for a four-user
system (the complex envelopes of the signals are shown assuming a carrier phase of
zero). The resulting crosscorrelation structure is also shown in this figure. There are
many signature sets and delays, in general, which yield the same crosscorrelation
structure, and so the signature sets which are shown are provided only to illustrate that the set
of crosscorrelations simulated are achievable. (It is possible to construct correlation
matrices which are not in the range of achievable values.) Both signature sets shown are
constructed to provide a high level of MUI which is fairly balanced between the users.
This choice of signature waveforms does not necessarily represent an intelligent choice
of waveforms for this particular chip-to-symbol rate ratio. It is also probably safe to say
that if the MUI were much more severe than that simulated in these examples, then the
dominating problem would most likely be that of the initial acquisition of the local
oscillators and code sequence generators at the front end of the receiver rather than the
performance of the multiuser detectors if synchronization is achieved.

Figure 3.4a An example set of signature waveforms and delays which yield the correlation structure shown.

Figure 3.4b A second set of signature waveforms and delays which yield the more severe correlation structure shown.

The simulations were performed in the MATLAB environment using the Monte Carlo simulation technique and twenty thousand transmitted bits for each user. The transmitted bits were a random sequence of binomial random variables with a probability of each value being a half (note that sending the all +1’s sequence or all -1’s sequence does not produce realistic MUI for the other users). The noises were colored properly by generating multidimensional scaled Gaussian random variables for each subinterval of the bit period delineated by the dotted lines in Figure 3.4, and then projecting the random variables onto the subsections of each waveform and finally summing up the projections. It was verified that this method yielded properly colored noise.
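The noise-coloring step can be illustrated with a short sketch. This is a minimal Python/NumPy illustration and not the MATLAB code used for the simulations: the sampled waveform matrix `S`, the function name `colored_noise` and the discrete-time approximation are all assumptions made for the example. White Gaussian samples are generated per subinterval and projected onto each waveform, so the matched-filter noise variates should have covariance equal to $N_0/2$ times the waveform correlation matrix.

```python
import numpy as np

def colored_noise(S, n0, n_trials, rng):
    """Project white Gaussian noise onto the rows of S (one waveform per row).

    S        : (K, N) array of unit-energy sampled waveforms (hypothetical)
    n0       : noise level N0; each white sample has variance n0 / 2
    returns  : (n_trials, K) array of correlated noise variates
    """
    white = rng.normal(scale=np.sqrt(n0 / 2.0), size=(n_trials, S.shape[1]))
    return white @ S.T

rng = np.random.default_rng(0)
# Two overlapping unit-energy waveforms on four subintervals (illustrative only).
S = np.array([[1.0, 1.0, 1.0, 1.0],
              [1.0, 1.0, 1.0, -1.0]]) / 2.0
z = colored_noise(S, n0=2.0, n_trials=200_000, rng=rng)
empirical_cov = z.T @ z / len(z)
expected_cov = (2.0 / 2.0) * S @ S.T   # (N0/2) times the correlation matrix
```

Comparing `empirical_cov` with `expected_cov` is one way to carry out the kind of check mentioned above.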

Figure 3.5 illustrates the performance of the various DFE structures, and the maximum likelihood sequence estimator (MLSE) for the channel of Figure 3.4a. In this graph, all four users have the same energy level and the performance shown is the average of all of the users’ performances. Also, the performance of a single-user system is shown. As expected, Xie’s DFE slightly outperforms the second-stage of Varanasi’s multistage algorithm due to the decomposition of the MUI into future and past parts. The hybrid algorithm is able to outperform both of the other DFE approaches.

Figure 3.6 shows the performance of user 1 in a case where $E_j/E_1 = 10$ dB for $j = 2,3,4$ and the channel of Figure 3.4b. In this case the three-stage hybrid algorithm nearly perfectly cancels the MUI while the other multistage algorithms are hardly able to perform better than the conventional decoder.

Figure 3.5 Performance curves for a single-user system, a conventional CDMA receiver, the Varanasi multistage algorithm, the Xie DFE, the optimal MLSE and the hybrid algorithm in the 4-user channel shown in Figure 3.4a. Varanasi’s Multistage and Xie’s DFE are shown as dotted lines, the single-user system and the Verdu MLSE are shown as solid lines, and the hybrid DFE is shown with a dashed line. All users have the same energy.

Figure 3.6 Performance curves for a single-user system, a conventional CDMA receiver, the Varanasi multistage algorithm, the Xie DFE, the MLSE and the hybrid algorithm for the 4-user channel of Figure 3.4b with $E_j/E_1 = 10$ dB (for all $j \neq 1$). Varanasi’s Multistage and the Xie DFE are shown as dotted lines, the MLSE is shown as a solid line and the hybrid DFE is shown with a dashed line.

Figure 3.7 shows the performance of user 1 for the channel of Figure 3.4b as the energy of the interfering users is varied and $E_1/N_0$ is fixed at 4 dB. The interesting feature of the graph is that both the optimal sequence estimator and the two- and three-stage hybrid DFE’s are resistant to the near-far problem on this channel, while the Varanasi and Xie DFE’s are not. This implies that for this case, the MLSE and hybrid DFE achieve the single-user performance level for sufficiently weak and sufficiently strong interference, while the Varanasi and Xie DFE structures only achieve it for sufficiently weak interference. This is due to the fact that as the energy of users 2, 3 and 4 becomes large with respect to the energy of user 1 and the noise, user 2’s and 4’s conventional first-stage has a decision statistic which is dominated by MUI (the MUI term is larger than the desired signal term). Thus as $E_j/E_1 \to \infty$ and correspondingly $E_j/N_0 \to \infty$, $j = 2,3,4$, the probability of error for users 2 and 4 does not approach zero and so there will always be some interference doubling for user 1’s decision statistic due to incorrect feedback. (Recall that when the feedback is incorrect, the interference is doubled, rather than being cancelled.) In this regime, any incorrect feedback of one bit decision will lead to an error with probability 1/2. This implies that the probability of error for user 1 must be greater than the probability of error that user 1 would have in the absence of MUI.
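The interference-doubling effect can be made concrete with a very small sketch. The function name `residual_mui` and the scalar model (one interferer with a known amplitude that already includes the cross-correlation) are assumptions made for illustration only.

```python
def residual_mui(amplitude, true_bit, fed_back_bit):
    """MUI left in the decision statistic after subtracting the fed-back estimate.

    amplitude    : interferer amplitude scaled by its cross-correlation (assumed known)
    true_bit     : bit actually sent by the interferer, +1 or -1
    fed_back_bit : tentative decision used for cancellation, +1 or -1
    """
    return amplitude * true_bit - amplitude * fed_back_bit

correct = residual_mui(3.0, +1, +1)    # feedback right: interference removed
wrong = residual_mui(3.0, +1, -1)      # feedback wrong: interference doubled
```

When the doubled term is much larger than the desired signal, its sign alone decides the output bit, and that sign agrees with the desired bit only half of the time.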

In the less severe channel of Figure 3.4a, the Varanasi and Xie versions of the DFE are able to achieve the single-user performance level for both sufficiently strong and sufficiently weak interference. This is illustrated in Figure 3.8.

The bottom line in all of these simulations is that the hybrid algorithm outperforms the other algorithms in situations where the conventional first-stages of the Xie and Varanasi DFE’s perform badly. In addition, the performance of the hybrid DFE is near that of the optimal MLSE over a wider range of MUI levels than for the other DFE’s.

Figure 3.7 Probability of bit error for user one in a 4-user system with $E_b/N_0 = 4$ dB for user one and the $\rho = 0.25$ channel of Figure 3.4b plotted versus the energy ratio.

Because the complexity of the hybrid DFE is not much greater than that of the other DFE’s (one additional "past MUI estimator" and an adder are necessary for each user), and because the typical desired operating region for a CDMA network is that with many users, and thus severe MUI, the hybrid DFE is an attractive approach. A nice feature of all of the DFE’s discussed here is that they may be implemented in a parallel fashion easily, since it is possible to use multiple processors with a common memory to realize the algorithms. This feature will allow the DFE’s to be applied to systems with many users.

3.3 Extensions

The two nonlinear DFE’s that have been proposed for CDMA networks led to the development of a hybrid DFE which outperformed the others in all of the situations simulated. We considered only nonlinear DFE’s, although it is also possible to combine these DFE’s with linear equalizers to form a variety of additional approaches. In [7] and [28], the idea of replacing the conventional first-stage of the Varanasi DFE with a decorrelating filter was considered, and significant performance gains were achieved. In [22] - [24], among other ideas, the idea of replacing the conventional first-stage of the Xie DFE with a decorrelator was considered along with some modifications to the way in which second-stage decisions were formed. These approaches carry with them the added complexity of the decorrelator, but they suggest an obvious extension to the hybrid DFE, namely, that a linear section could be added to the first-stage of the hybrid multistage DFE. This linear section would attempt to eliminate the future MUI from the first-stage’s decision statistic either by an inverse filtering approach like that in the DFE in [22] and [23], or by adopting the minimum mean squared error criterion for the future MUI cancellation and using an LMS or RLS approach to adapt the feedforward matrix filter taps. Related ideas were examined in [24].

In [22] - [24], the idea of ordering the users according to their energies rather than their delays was considered. This modification implies that the "future MUI" would take on the meaning of weaker MUI and the "past MUI" would take on the meaning of stronger MUI. It is possible that this modification, when applied to the hybrid, would provide better performance. This modification would also mean that the standard hybrid’s first-stage would be roughly equivalent to the successive cancellation approaches in [34]. The term successive cancellation refers to approaches wherein the strongest user’s signal is decoded first and then remodulated and subtracted from the next strongest user’s signal and so on. Thus the hybrid with the users ordered according to energy is a multistage generalization of the successive cancellation schemes.
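The successive cancellation idea can be sketched in a few lines. This is a simplified illustration, not the scheme of [34]: it assumes a synchronous, single-interval model with known amplitudes and a known correlation matrix, and the name `successive_cancellation` is hypothetical.

```python
import numpy as np

def successive_cancellation(y, R, amps):
    """Detect users from strongest to weakest, subtracting each decided user.

    y    : matched-filter outputs for one symbol interval (synchronous model)
    R    : signature cross-correlation matrix with unit diagonal
    amps : received amplitudes, assumed known at the receiver
    """
    y = np.asarray(y, dtype=float).copy()
    bits = np.zeros(len(y))
    for k in np.argsort(-np.asarray(amps)):       # strongest user first
        bits[k] = 1.0 if y[k] >= 0 else -1.0       # hard decision
        y -= R[:, k] * amps[k] * bits[k]           # remodulate and subtract
    return bits

R = np.array([[1.0, 0.5], [0.5, 1.0]])
amps = np.array([1.0, 4.0])
true_bits = np.array([1.0, -1.0])
y = R @ (amps * true_bits)                         # noiseless outputs
detected = successive_cancellation(y, R, amps)
conventional = np.sign(y)                          # no cancellation at all
```

In this noiseless example the conventional decision for the weak user is wrong because the strong user's interference dominates, while the decision made after cancellation is correct.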

Another variation of the basic Varanasi DFE is considered in [28]. In this paper, soft decision approaches are employed such as the use of a linear clipper in place of the threshold detector in each stage. These modifications are shown in [28] to provide gains in terms of the asymptotic multiuser efficiency for the two-user case. It is quite reasonable to expect that if these modifications are applied to the Xie DFE or the hybrid DFE, similar gains would be obtained.
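The difference between a threshold detector and a linear clipper can be sketched as follows. The clipping level `c` and both function names are assumptions for illustration; the exact nonlinearity studied in [28] may differ in scaling.

```python
import numpy as np

def hard_limiter(x):
    """Threshold detector: returns +1 or -1 regardless of how small |x| is."""
    return np.where(np.asarray(x) >= 0, 1.0, -1.0)

def linear_clipper(x, c=1.0):
    """Soft decision: linear for |x| < c, saturated at +/-1 otherwise."""
    return np.clip(np.asarray(x, dtype=float) / c, -1.0, 1.0)

stats = np.array([-3.0, -0.2, 0.5, 2.0])
hard = hard_limiter(stats)
soft = linear_clipper(stats)
```

An unreliable statistic such as -0.2 is fed back at full weight by the hard limiter but only at weight 0.2 by the clipper, which limits the damage done by a wrong tentative decision.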

Thus in this chapter we have formulated a hybrid DFE which is capable of incorporating the best features of virtually every other DFE that has been proposed. This hybrid DFE shows that the previously proposed approaches have features which are not mutually exclusive, but instead, their features may be combined into a common architecture which outperforms all of the others.

In chapter 5, multiuser receivers which are appropriate for a convolutionally coded CDMA system are proposed which are based on the various DFE’s of this chapter. In that chapter, we will see that the DFE category of multiuser receiver is again an attractive approach, in that it has a high performance level and a moderately low complexity. Before examining DFE’s for coded links, however, we will first want to consider the optimum sequence estimator in the next chapter to provide a baseline for the best that the suboptimum approaches of chapter 5 can hope to achieve.




Chapter  4  The  Multiuser  ML  Sequence  Estimator  for  Convolutionally 
Coded  CDMA  Links 

Now that, among other things, we have reviewed the various types of multiuser receivers that have been proposed for uncoded CDMA systems with additive white Gaussian noise, we are prepared to begin a study of the possible multiuser receiver approaches for coded links. We will begin this study by formulating the optimal sequence estimator in this chapter. An asymptotic analysis and computer simulations will be used to show that this receiver is able to significantly outperform the conventional receiver which was illustrated in Figure 2.6. Furthermore, we will see that the ML sequence estimator is able to achieve a single-user performance level in many situations.

4.1  Optimum  Sequence  Estimator  For  Rate-1/2  Convolutional  Codes 

The optimal MLSE will now be derived for the special case in which each user in the network is employing a rate-1/2 convolutional code with a constraint length $W$ (or memory-order $W-1$), so $T_s = T_b = 2T$. Our limitation to this special case will facilitate considerably the derivation of the decoder, and it will then be outlined how the optimal decoder can be derived in a similar way for a general rate-$P/Q$ convolutional code case.

To begin, it is important to note that the optimal sequence estimator or equalizer for multiple-user uncoded signals operates in a "round-robin" fashion among all $K$ users in the system [1]. This Viterbi algorithm traverses one trellis stage per channel bit observed per user. The optimal sequence estimator for decoding the rate-1/2 code for one of the users in a single-user environment, however, is a Viterbi algorithm which requires two channel observations from the user to move ahead one stage in the trellis [62]. The fact that the equalizer and decoder operate in a fundamentally different way suggests that a slightly different view of the problem is required. The following view of the problem will be adopted in order to bypass this issue. The rate-1/2 convolutional code can be viewed not as a code which produces two binary bits per information bit period, $T_b$, but as an equivalent trellis code which produces one 4-ary coded waveform every $T_b$ seconds. By formulating the equalization problem at the receiver with respect to this super-code-symbol view of the received signal, we can accomplish both the tasks of equalization and decoding in the same Viterbi algorithm. Because there is only one 4-ary super-symbol received for each information bit that must be decided, the decoder can be formulated in basically the same fashion as was used in [1] for the MUI problem or [61] for the ISI problem.
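The super-code-symbol view can be illustrated with a small encoder sketch. The generator polynomials (7, 5 in octal), the constraint length of 3, the packing rule `2*c1 + c2` and the function name are assumptions chosen for the example; no particular code is specified at this point in the text.

```python
def encode_super_symbols(info_bits, generators=(0b111, 0b101), constraint_len=3):
    """Rate-1/2 convolutional encoder that emits one 4-ary symbol per info bit.

    The two code bits (c1, c2) produced in each information-bit period are
    packed into a single index 2*c1 + c2 in {0, 1, 2, 3}.
    """
    state = 0                                   # holds the last W input bits
    mask = (1 << constraint_len) - 1
    symbols = []
    for bit in info_bits:
        state = ((state << 1) | bit) & mask     # shift in the new bit
        c1 = bin(state & generators[0]).count("1") % 2
        c2 = bin(state & generators[1]).count("1") % 2
        symbols.append(2 * c1 + c2)
    return symbols

symbols = encode_super_symbols([1, 0, 1, 1])
```

Each information bit yields exactly one 4-ary symbol, so a trellis stepped once per symbol advances one information bit per stage, which is what allows equalization and decoding to share a single Viterbi recursion.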

We  begin  by  defining  the  following  notation. 

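$$g_1(t) = \begin{cases} 1, & t \in [0, T) \\ 0, & \text{otherwise} \end{cases} \tag{4.1}$$

$$g_2(t) = \begin{cases} 1, & t \in [T, 2T) \\ 0, & \text{otherwise} \end{cases} \tag{4.2}$$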


Next,  the  following  function  may  be  defined 

where, as in Chapter 2, $D_k^{(q)}(n)$ is the $k$th user’s code bit $q$ in the time interval $n$. Two of the signature-carrier waveforms can likewise be concatenated to form a super-signature waveform.

$$\tilde{s}_k(t - nT_b - \tau_k) = s_k(t - nT_b - \tau_k) + s_k(t - (n + 1/2)T_b - \tau_k) \tag{4.4}$$

This again presumes that the signature sequence repeats every code symbol period. The received waveform may now be written in terms of these waveforms of duration $T_b$:

$$r(t) = \sum_{n=-\infty}^{\infty} \sum_{k=1}^{K} D_k(t - nT_b - \tau_k)\, \tilde{s}_k(t - nT_b - \tau_k) \sqrt{E_k} + z(t) \tag{4.5}$$

where $z(t)$ is the additive white Gaussian noise with a two-sided spectral density of $N_0/2$. This signal may be viewed as a four-valued super-code symbol, $\{D_k^{(1)}(n), D_k^{(2)}(n)\}$, modulating a pair of orthonormal basis functions through the procedure defined above. The basis functions in this new view of the waveform are $\phi_{1k}(t)$ (4.6) and $\phi_{2k}(t)$ (4.7), appropriately synchronized with the information bit periods. Thus, this equivalent view of the coding process suggests that the information bits are mapped by the encoder onto waveforms in a space defined by $\phi_{1k}(t)$ and $\phi_{2k}(t)$. Note that although the bases defined in (4.6) and (4.7) are orthonormal, they are not, in general, orthogonal to $\phi_{1j}(t)$ and $\phi_{2j}(t)$, which are the basis set for another user in the system, user $j$, since $s_k(t)$ and $s_j(t)$ are not orthogonal in general. The result when the received signal is a sum of $K$ component signals is MUI. We now define four parameters which are a measure of the degree of correlation between the basis functions of the different users.

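$$U_{jk}(l) = \int_{-\infty}^{\infty} \phi_{1j}(t - \tau_j)\, \phi_{1k}(t - lT_b - \tau_k)\, dt \tag{4.8}$$

$$V_{jk}(l) = \int_{-\infty}^{\infty} \phi_{2j}(t - \tau_j)\, \phi_{2k}(t - lT_b - \tau_k)\, dt \tag{4.9}$$

$$W_{jk}(l) = \int_{-\infty}^{\infty} \phi_{1j}(t - \tau_j)\, \phi_{2k}(t - lT_b - \tau_k)\, dt \tag{4.10}$$

$$X_{jk}(l) = \int_{-\infty}^{\infty} \phi_{2j}(t - \tau_j)\, \phi_{1k}(t - lT_b - \tau_k)\, dt \tag{4.11}$$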


These parameters play the same role in the super-symbol view of the coded signal that $\rho_{jk}(l)$ plays for the standard view of the signal. In fact, $U$, $V$, $W$ and $X$ can be related to $\rho$ directly by substituting (4.4) into (4.6) and (4.7) and then into (4.8) through (4.11).

$$U_{jk}(l) = \rho_{jk}(2l + m_k - m_j) \tag{4.12}$$

$$V_{jk}(l) = \rho_{jk}(2l + m_k - m_j) \tag{4.13}$$

$$W_{jk}(l) = \rho_{jk}(2l + 1 + m_k - m_j) \tag{4.14}$$

$$X_{jk}(l) = \rho_{jk}(2l - 1 + m_k - m_j) \tag{4.15}$$

(Recall that $\tau_k = m_k T + \kappa_k$, $\kappa_k \in [0, T)$, and $m_k \in \{0, \dots, Q-1\}$.) Note that $U_{jk}(l) = V_{jk}(l) = W_{jk}(l) = X_{jk}(l) = 0$ for $|l| > 1$; this fact will play an important role in determining the proper state description of the system for the optimal sequence estimator. Some other useful properties of the correlation parameters are $U_{jk}(l) = U_{kj}(-l)$, $V_{jk}(l) = V_{kj}(-l)$ and $X_{jk}(l) = W_{kj}(-l)$.

Beginning with equation (4.5), note that by performing a modulo-$K$ decomposition of the index $i$, namely $i = \alpha(i)K + \beta(i) - 1$, we can write

This assumes that the $K$ users transmit $(2M+1)/K$ information bits each in the time interval of interest and that the signal is zero outside of this interval. We now further simplify the notation by defining the following terms.


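(4.17)

$$D_i^{(q)} = D_{\beta(i)}^{(q)}(\alpha(i)) \tag{4.18}$$

$$U_{im} = U_{\beta(i)\beta(m)}(\alpha(m) - \alpha(i)) \tag{4.19}$$

$$V_{im} = V_{\beta(i)\beta(m)}(\alpha(m) - \alpha(i)) \tag{4.20}$$

$$W_{im} = W_{\beta(i)\beta(m)}(\alpha(m) - \alpha(i)) \tag{4.21}$$

$$X_{im} = X_{\beta(i)\beta(m)}(\alpha(m) - \alpha(i)) \tag{4.22}$$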
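The modulo-$K$ index decomposition used in these definitions can be written out as a short sketch. The function name `decompose_index` is hypothetical; the convention that the user index runs from 1 to $K$ follows from the "$-1$" in the decomposition.

```python
def decompose_index(i, num_users):
    """Split a global symbol index i into (alpha, beta) with i = alpha*K + beta - 1.

    alpha is the information-bit interval and beta in {1, ..., K} is the user.
    Python's floor division keeps the identity valid for negative i as well.
    """
    alpha, remainder = divmod(i, num_users)
    return alpha, remainder + 1

pairs = [decompose_index(i, 4) for i in range(-2, 6)]
```

Stepping $i$ by one moves to the next user in round-robin order, and every $K$ steps the interval index advances by one.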

We have now laid the foundation for the derivation of the MLSE. This development will closely follow the derivation of the optimal MLSE in [61] and [64] for the uncoded ISI channel.

By expanding (4.16) with a Karhunen-Loeve expansion and letting the number of basis functions grow to infinity, we obtain the following waveform metric:

$$\Lambda(D_M) = -\int_{-\infty}^{\infty} \Big( r(t) - \sum_{i=-M}^{M} \big[ D_i^{(1)} \phi_{1\beta(i)}(t - \alpha(i)T_b - \tau_{\beta(i)}) + D_i^{(2)} \phi_{2\beta(i)}(t - \alpha(i)T_b - \tau_{\beta(i)}) \big] \sqrt{E_i} \Big)^2 dt \tag{4.23}$$

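We next define

$$r_i^{(1)} = r_{\beta(i)}^{(1)}(\alpha(i)) = \int_{-\infty}^{\infty} r(t)\, \phi_{1\beta(i)}(t - \alpha(i)T_b - \tau_{\beta(i)})\, dt \tag{4.24}$$

and

$$r_i^{(2)} = r_{\beta(i)}^{(2)}(\alpha(i)) = \int_{-\infty}^{\infty} r(t)\, \phi_{2\beta(i)}(t - \alpha(i)T_b - \tau_{\beta(i)})\, dt \tag{4.25}$$

which represent the outputs of a pair of matched filters or correlators for the basis functions for user $\beta(i)$ at time $t = (\alpha(i)+1)T_b + \tau_{\beta(i)}$. By expanding (4.23), and then collecting the appropriate terms, we get the following metric.

$$\Lambda(D_i) = \Lambda(D_{i-1}) + 2\Big[ D_i^{(1)} \sqrt{E_i} \Big( r_i^{(1)} - \sum_{l=1}^{L} \big( D_{i-l}^{(1)} U_{i,i-l} + D_{i-l}^{(2)} W_{i,i-l} \big) \sqrt{E_{i-l}} \Big) + D_i^{(2)} \sqrt{E_i} \Big( r_i^{(2)} - \sum_{l=1}^{L} \big( D_{i-l}^{(1)} X_{i,i-l} + D_{i-l}^{(2)} V_{i,i-l} \big) \sqrt{E_{i-l}} \Big) \Big] \tag{4.26}$$

where $D_i$ represents the multiuser code-symbol sequence up to time interval $i$, and $L$ is the smallest integer such that for every $L' > L$ we have $U_{jk}(\alpha(L')) = V_{jk}(\alpha(L')) = W_{jk}(\alpha(L')) = X_{jk}(\alpha(L')) = 0$. We have already seen that the correlation parameters are zero when $|\alpha(L')| > 1$, so $L = K-1$.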

There are a number of important observations that can be made from the path metric given in equation (4.26). First of all, the $i$th stage metric depends only on the code symbols in the set

$$S = \{ D_i^{(1)}, D_i^{(2)}, D_{i-1}^{(1)}, D_{i-1}^{(2)}, \dots, D_{i-K+1}^{(1)}, D_{i-K+1}^{(2)} \}, \tag{4.27}$$

along with the matched filter outputs, $r_i^{(1)}$ and $r_i^{(2)}$, as well as the signal energies and correlations. It is possible to estimate the crosscorrelations using the local oscillators and code generators which are assumed to be synchronized to the $K$ components of the incoming signal. We can also estimate the energies, $\{E_k\}$, by averaging the squared outputs of the matched filters for a number of bits. (The number over which they would be averaged would depend on the rate at which the relative strengths of the users are varying.)

For any user, $k$, the convolutional encoder defines a mapping rule from the input information symbol and present state of the encoder to the code bits, $D_k^{(1)}(n)$ and $D_k^{(2)}(n)$. If we define $h(\cdot)$ as the mapping rule from the input information symbol and state to the 4-ary super-code symbol, $\{D_k^{(1)}(n), D_k^{(2)}(n)\}$, then by substituting the information symbols that define the state of the encoder in for the state, the following expression may be written:

$$\{D_k^{(1)}(n), D_k^{(2)}(n)\} = h\big( I_k(n), I_k(n-1), \dots, I_k(n-W+1) \big) \tag{4.28}$$

Thus, in this form, it is clear that the 4-ary super-code symbol depends on only $W$ information symbols. Using this information, it is easy to redefine the set $S$ which was defined in equation (4.27) in terms of the information symbols which affect the $i$th stage metric.

$$S = \{ I_i, \sigma_i \} \tag{4.29}$$

$$\sigma_i = \{ I_{i-1}, I_{i-2}, \dots, I_{i-KW+1} \} \tag{4.30}$$

where $I_i = I_{\beta(i)}(\alpha(i))$. Thus it is now apparent that the system may be described in terms of $2^{KW-1}$ states, since the information symbols are binary. Furthermore, the maximum likelihood sequence estimator can be implemented with a Viterbi algorithm operating on a trellis with $2^{KW-1}$ states and two branches per state. This trellis will be cyclically time-varying as in the uncoded case [1]. Furthermore, it is clear that this trellis reduces to the trellis derived in [1] when the constraint length of the code is one (uncoded transmission for each user). Obviously, the number of states in the MLSE grows very quickly with both the number of users in the system and the constraint length, $W$, of the codes being used. In fact, for a simple 4-user case where each user uses a $W = 3$, or 4-state code, the MLSE requires a Viterbi algorithm operating on a trellis with 2048 states!
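The growth of the trellis can be tabulated with a short sketch. The function name `mlse_trellis_size` is hypothetical; the state count $2^{KW-1}$ and the two branches per state are taken from the description above.

```python
def mlse_trellis_size(num_users, constraint_len):
    """Return (states, branches per stage) for the rate-1/2 multiuser MLSE."""
    states = 2 ** (num_users * constraint_len - 1)
    branches = 2 * states          # two branches leave every state
    return states, branches

four_user = mlse_trellis_size(4, 3)    # the 4-user, W = 3 example
uncoded = mlse_trellis_size(4, 1)      # W = 1 corresponds to the uncoded case
```

Adding a single user to the 4-user, $W = 3$ system multiplies the number of states by $2^W = 8$, which shows how quickly the optimal receiver becomes impractical.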




The time complexity per bit decoded for the multiuser MLSE is TCB $= O(2^{KW})$ since there are $2^{KW}$ metrics which must be computed at each stage of the trellis and one information bit is decided at each stage. Note that for the case of $W = 1$, the TCB calculated in [1] is again obtained.

Now that the MLSE has been derived for the rate-1/2 case, it is straightforward to generalize to the case of rate-$P/Q$ convolutional codes. The function $D_k(t)$ will again have to be constructed from a set of orthogonal basis functions. One reasonable choice would be a set of $Q$ non-overlapping pulses, each of duration $T$. Again, the function $\tilde{s}_k(t)$ would be constructed from concatenations of $Q$ versions of $s_k(t)$. The metric derivation could then proceed in the same fashion as in the rate-1/2 case. There are $2^P$ input hypotheses to test in each $T_s$ for each user, so the overall trellis will have $2^P$ branches per state. Furthermore, the state of the system will be specified by $(\kappa+P)(K-1)+\kappa$ information bits, so it will have $2^{(\kappa+P)(K-1)+\kappa}$ states, where $\kappa$ is the base-2 logarithm of the number of states in the single user’s encoder. This will result in a TCB $= O(2^{(\kappa+P)K}/P)$. Because the metrics grow with $K$ as well, the number of arithmetic operations which must be performed for each decoded bit is $O(K 2^{(\kappa+P)K}/P)$. This complexity measure will be used throughout Chapter 5 to compare the complexity of the various receivers as well as the TCB.

Clearly the exponential dependence of the TCB on the number of users, the number of states in each of the user’s codes and $P$ makes the use of the optimal decoder prohibitive for a realistic system. It is, however, an important receiver because it represents the best that can be achieved in terms of sequence error probability, and it will provide a good baseline by which to judge the quality of suboptimal schemes. This receiver also raises the possibility of using a variety of sparse searching algorithms like a sequential decoder as was used in [14] for the uncoded case, or reduced state sequence estimation techniques like the ones proposed in [8] and [56] for the uncoded MUI equalization problem or [47] for the combined equalization and decoding problem for single-user links suffering from ISI.

4.2 Performance of the Optimal Sequence Estimator

To illustrate the derivation of some performance bounds for the MLSE, we will again use the rate-1/2 code example. In this analysis we will fairly closely follow the analysis which appeared in [1] and [61]. In keeping with [1], we consider the decoding window to range from the index $-M$ to the index $M$. The goal of this section is to estimate the performance of the optimal sequence estimator by bounding the finite and infinite horizon error probabilities for the $k$th user in the system, denoted $P_k^M(n)$ and $P_k = \lim_{M \to \infty} P_k^M(n)$.

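Consider the transmission of the sequence of super-code symbols, $D = \{D_i^{(1)}, D_i^{(2)}\}_{i=-M}^{M}$, and a competing sequence in the trellis, $D + 2\epsilon$, corresponding to the sequence $\{D_i^{(1)} + 2\epsilon_i^{(1)}, D_i^{(2)} + 2\epsilon_i^{(2)}\}_{i=-M}^{M}$, where $\epsilon = \{\epsilon_i^{(1)}, \epsilon_i^{(2)}\}_{i=-M}^{M}$ is a sequence of code error symbols. Each $\epsilon_i^{(q)}$ can take on values in the set $G = \{0, \pm 1\}$. Next, define the following sets:

$$B = \{ \epsilon : \epsilon_i^{(q)} \in G,\ i = -M, \dots, M,\ q = 1, 2,\ \epsilon_i^{(q)} \neq 0 \text{ for some } i, q \} \tag{4.31}$$

$$A(D) = \{ \epsilon : \epsilon \in B,\ D + 2\epsilon \in C \} \tag{4.32}$$

$$C = \{ D : D \in h(\{I\}) \} \tag{4.33}$$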

We see above that $B$ is the set of error sequences which have a nonzero element somewhere in the sequence, $A(D)$ is the set of error sequences in $B$ which have the property that $D + 2\epsilon$ is a valid code bit sequence, and $C$ is the set of valid code sequences. Finally, $h(\cdot)$ is the mapping rule defined by the code from an information sequence, $I$, to a sequence of super-code symbols, $D$, as in (4.28). Since this mapping rule is a one-to-one function, it has an inverse. We define the information error sequence

$$\psi = h^{-1}(D + 2\epsilon) - I \tag{4.34}$$

which is the information bit error sequence corresponding to $D + 2\epsilon$ such that if $D = h(I)$, then $D + 2\epsilon = h(I + \psi)$. This allows us to define

$$A_k^M(D, n) = \{ \epsilon : \epsilon \in A(D),\ \psi_k(n) \neq 0 \} \tag{4.35}$$


so $A_k^M(D, n)$ is the set of admissible error sequences which affect the $n$th information bit of the $k$th user. From these definitions, it follows that the probability of error for the $n$th bit of the $k$th user is given by

$$P_k^M(n) = \sum_{D \in C} P\Big( \bigcup_{\epsilon \in A_k^M(D, n)} \{ \Lambda(D + 2\epsilon) > \Lambda(D) \mid D \text{ sent} \} \Big) \cdot P(D \text{ sent}) \tag{4.36}$$

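As is the usual approach, we choose to bound (4.36) with a union bound.

$$P_k^M(n) \le \sum_{D \in C} \sum_{\epsilon \in A_k^M(D, n)} P\big( \Lambda(D + 2\epsilon) > \Lambda(D) \mid D \text{ sent} \big) \cdot P(D \text{ sent}) \tag{4.37}$$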

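The event $\Lambda(D + 2\epsilon) > \Lambda(D)$ may now be written by expanding equation (4.23) and substituting

$$r_i^{(1)} = r_{\beta(i)}^{(1)}(\alpha(i)) = \sum_{j=i-K+1}^{i+K-1} \big( D_j^{(1)} U_{ij} + D_j^{(2)} W_{ij} \big) \sqrt{E_j} + z_i^{(1)} \tag{4.38}$$

and

$$r_i^{(2)} = r_{\beta(i)}^{(2)}(\alpha(i)) = \sum_{j=i-K+1}^{i+K-1} \big( D_j^{(1)} X_{ij} + D_j^{(2)} V_{ij} \big) \sqrt{E_j} + z_i^{(2)} \tag{4.39}$$

for $r_i^{(1)}$ and $r_i^{(2)}$ respectively, where $z_i^{(1)}$ and $z_i^{(2)}$ are the noise variates at the output of the matched filters for the basis functions $\phi_{1\beta(i)}$ and $\phi_{2\beta(i)}$ respectively for the interval $\alpha(i)$. After some algebra, the following expression for the event $\Lambda(D + 2\epsilon) > \Lambda(D)$ is obtained.

$$\sum_{i=-M}^{M} \sum_{m=-M}^{M} \big( \epsilon_i^{(1)} \epsilon_m^{(1)} U_{im} + \epsilon_i^{(2)} \epsilon_m^{(2)} V_{im} + \epsilon_i^{(1)} \epsilon_m^{(2)} W_{im} + \epsilon_i^{(2)} \epsilon_m^{(1)} X_{im} \big) \sqrt{E_i E_m} < \sum_{i=-M}^{M} \sqrt{E_i} \big( \epsilon_i^{(1)} z_i^{(1)} + \epsilon_i^{(2)} z_i^{(2)} \big) \tag{4.40}$$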
Let $\Delta^2(\epsilon)$ represent the left side of equation (4.40). The right side of equation (4.40) is a linear combination of Gaussian random variables, $z_i^{(1)}$ and $z_i^{(2)}$. It is not difficult to show that $E[z_i^{(1)}] = E[z_i^{(2)}] = 0$ and also that


E 


(4.41) 


As a result, if we define $y$ to be the right side of equation (4.40), then it is not difficult to show that $E[y] = 0$ and $\mathrm{Var}[y] = \Delta^2(\epsilon) \cdot N_0/2$.
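The variance of the right side of (4.40) can be traced through in a few lines. The sketch below is an illustration rather than the original derivation: it assumes that the noise variates are the projections of $z(t)$ onto the shifted basis functions, and that the correlations $U_{im}$, $V_{im}$, $W_{im}$ and $X_{im}$ pair the basis functions in the way their use in (4.40) suggests.

```latex
\begin{align*}
z_i^{(q)} &= \int_{-\infty}^{\infty} z(t)\,
  \phi_{q\beta(i)}\bigl(t-\alpha(i)T_b-\tau_{\beta(i)}\bigr)\,dt,
  \qquad q = 1, 2, \\
E\bigl[z_i^{(1)} z_m^{(1)}\bigr] &= \tfrac{N_0}{2}\,U_{im}, \quad
E\bigl[z_i^{(2)} z_m^{(2)}\bigr] = \tfrac{N_0}{2}\,V_{im}, \quad
E\bigl[z_i^{(1)} z_m^{(2)}\bigr] = \tfrac{N_0}{2}\,W_{im}, \quad
E\bigl[z_i^{(2)} z_m^{(1)}\bigr] = \tfrac{N_0}{2}\,X_{im}, \\
\operatorname{Var}[y] &= \sum_{i}\sum_{m}\sqrt{E_i E_m}\,
  \bigl(\epsilon_i^{(1)}\epsilon_m^{(1)} U_{im}
      + \epsilon_i^{(2)}\epsilon_m^{(2)} V_{im}
      + \epsilon_i^{(1)}\epsilon_m^{(2)} W_{im}
      + \epsilon_i^{(2)}\epsilon_m^{(1)} X_{im}\bigr)\tfrac{N_0}{2}
  = \tfrac{N_0}{2}\,\Delta^2(\epsilon).
\end{align*}
```

The second line uses only the whiteness of $z(t)$ with two-sided spectral density $N_0/2$, and the double sum in the last line is the left side of (4.40).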

Next, the two-sequence error probability, or the probability of the event given in equation (4.40), becomes the probability that the Gaussian random variable, $y$, is larger than the threshold, $\Delta^2(\epsilon)$. We next define the following efficiency parameter for the pair of sequences differing by the code symbol error sequence $\epsilon$, as

$$\eta_k(\epsilon) = \frac{\Delta^2(\epsilon)}{2E_k} = \frac{\Delta^2(\epsilon)}{E_{bk}} \tag{4.42}$$

where $E_{bk} = 2E_k$ is the energy per information bit for user $k$. This allows us to write

$$P\big( \Lambda(D + 2\epsilon) > \Lambda(D) \mid D \text{ sent} \big) = Q\left( \sqrt{\frac{2 \eta_k(\epsilon) E_{bk}}{N_0}} \right) \tag{4.43}$$


so $\eta_k(\epsilon)$ is the asymptotic efficiency relative to uncoded BPSK transmission for the $k$th user for the pair of sequences $D$ and $D + 2\epsilon$. This can be shown to reduce to the form of the distance measure in [1] for the uncoded system because, as in [1], $\Delta^2(\epsilon)$ may also be expressed as the squared $L_2$ norm of the signal generated by modulating the error sequence. The signal

$$S(\epsilon, t) = \sum_{i=-M}^{M} \Big[ \epsilon_i^{(1)} \phi_{1\beta(i)}(t - \alpha(i)T_b - \tau_{\beta(i)}) \sqrt{E_i} + \epsilon_i^{(2)} \phi_{2\beta(i)}(t - \alpha(i)T_b - \tau_{\beta(i)}) \sqrt{E_i} \Big] \tag{4.44}$$

has energy

$$\| S(\epsilon, t) \|^2 = \int_{-\infty}^{\infty} | S(\epsilon, t) |^2\, dt = \Delta^2(\epsilon) \tag{4.45}$$

This implies that an alternate way to express the efficiency parameter defined in (4.42) for the pair of sequences $D$ and $D + 2\epsilon$ is

$$\eta_k(\epsilon) = \frac{\| S(\epsilon, t) \|^2}{E_{bk}} \tag{4.46}$$

which is analogous to the form of the distance measure in [1] for the uncoded system.
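The two-sequence error probability is easy to evaluate numerically once an efficiency value is known. The sketch below uses only the Python standard library; the function names are hypothetical, and the formula is the Gaussian tail probability $Q(\sqrt{2\eta E_b/N_0})$ that follows from a zero-mean Gaussian with variance $\Delta^2 N_0/2$ exceeding the threshold $\Delta^2$.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pairwise_error_probability(efficiency, eb_over_n0_db):
    """Two-sequence error probability Q(sqrt(2 * eta * Eb/N0))."""
    eb_over_n0 = 10.0 ** (eb_over_n0_db / 10.0)
    return q_function(math.sqrt(2.0 * efficiency * eb_over_n0))

p_single_user = pairwise_error_probability(1.0, 4.0)   # efficiency of one
p_degraded = pairwise_error_probability(0.5, 4.0)      # 3 dB efficiency loss
```

An efficiency of one reproduces the uncoded single-user BPSK error rate at the same $E_b/N_0$, while an efficiency below one acts as an equivalent loss in signal energy.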

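In order to construct a lower bound on the probability of error for user $k$, we define the following minimum efficiency as

$$\eta_{k,\min}^M(n) = \inf_{D \in C}\ \inf_{\epsilon \in A_k^M(D, n)} \eta_k(\epsilon) \tag{4.47}$$

so that

$$P_k^M(n) \ge P\big[ \eta_k(\epsilon) = \eta_{k,\min}^M(n) \big] \cdot Q\left( \sqrt{\frac{2 \eta_{k,\min}^M(n) E_{bk}}{N_0}} \right) \tag{4.48}$$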


Thus we now have a lower bound expression for $P_k^M(n)$ given in (4.48). When (4.43) is substituted into equation (4.37) we have an upper bound on $P_k^M(n)$.

To obtain bounds for the infinite horizon error probabilities $P_k = \lim_{M \to \infty} P_k^M(i)$, we may use the same argument used in [1], as it applies equally well to the coded case. Namely, for any error sequence $\epsilon$ such that $\epsilon_j = \epsilon_n$ for $j < n$, the sequence

$$\epsilon'_m = \begin{cases} \epsilon_m, & m \le j \\ \epsilon_{m+n-j}, & m > j \end{cases} \tag{4.49}$$


satisfies $\Delta^2(\epsilon') \le \Delta^2(\epsilon)$ or equivalently $\eta_k(\epsilon') \le \eta_k(\epsilon)$, else it would be possible to construct a sequence with a negative energy. Thus, we may conclude exactly as in [1] that the infinite horizon minimum efficiencies are achieved by finite length error sequences. As a result, the infinite horizon error probability for the $k$th user may be lower bounded by

$$P_k \ge P\big[ \eta_k(\epsilon) = \eta_{k,\min} \big] \cdot Q\left( \sqrt{\frac{2 \eta_{k,\min} E_{bk}}{N_0}} \right) \tag{4.50}$$

Similarly, by passing (4.37) to the limit as $M$ approaches infinity,

$$P_k \le \sum_{D \in C} P(D \text{ sent}) \sum_{\epsilon \in A_k(D, n)} Q\left( \sqrt{\frac{2 \eta_k(\epsilon) E_{bk}}{N_0}} \right) \tag{4.51}$$

where $A_k(D, n) = \lim_{M \to \infty} A_k^M(D, n)$. We must note that (4.51) may not converge for all noise levels. In [1], the convergence region was increased by limiting the inner sum to the set of indecomposable sequences. This solution perhaps would be of use here as well to obtain a tighter upper bound; however, we will not focus on this issue here because the convergence of the upper bound will not affect the rest of our analysis.

In the high signal-to-noise ratio regime, the terms in (4.51) with the minimum
efficiency will dominate the asymptotic behavior of the receiver. As a result, we will
refer to the minimum efficiency, $\eta_{k,\min}$, as the asymptotic multiuser coding gain for user $k$
(AMCG). The AMCG is an efficiency parameter which is a measure of the energy gain or
loss of the system relative to an uncoded BPSK system operating alone with an energy
per information bit of $E_{bk}$. Observe that $\eta_{k,\min}$ depends on the crosscorrelations and
relative energies of the users as well as the properties of the code.

In the limiting cases where there is only $K = 1$ user in the system, or when there are
$K$ users in the system with perfectly orthogonal super-signature sequences, $\eta_{k,\min}$ is
the asymptotic coding gain (ACG) of a single-user system operating with the same code.
In the special case where the users do not employ coding, $\eta_{k,\min}$ is equivalent to the
asymptotic multiuser efficiency (AME) obtained in [1] for the optimal multiuser receiver
for the uncoded system. Thus the asymptotic multiuser coding gain unifies the asymptotic
coding gain and the asymptotic multiuser efficiency parameters.

$\eta_k(e)$ may be rewritten as a quadratic form,

$$\eta_k(e) = \frac{1}{2E_k}\, e_r^T E_r H_r E_r e_r. \qquad (4.52)$$


To do this, we define the vector $e_r$ to be the subvector of the infinite length error
sequence $e$ which consists of all of the nonzero components of $e$ and all zero components
of $e$ which are surrounded by nonzero components. If we assume that the dimension of
the vector $e_r$ is $2r \times 1$, and


«r  =  f  (4-53) 

then  the  matrix  Hr  is  defined  as  H  r  =  where  the  sub  matrices  are  given  by 


Ujk  Wjt\ 

Xjt 


(4.54) 


Thus, $H_r$ has dimensions $2r \times 2r$. Also, $E_r$ is a diagonal energy matrix with diagonal
elements $E_{jj} = (E_{P(j)})^{1/2}$.

As an example, consider the 2-user case with each user employing a rate 1/2, 4-state
convolutional code, as is shown in Figure 4.2. If user 1 sends an all-zeros sequence, and
user 2 sends all-zeros except for stage $i_0$, where a 1 is sent, then a valid error sequence is

$$e_6 = (-1\;\; -1\;\; 1\;\; 1\;\; 0\;\; -1\;\; 0\;\; 1\;\; -1\;\; -1\;\; 1\;\; 1)^T. \qquad (4.55)$$


For this case, assuming that $m_1 = m_2 = 0$ so that $\tau_1 = T_1$ and $\tau_2 = T_2$, the $H_6$ matrix takes
the form


1  0  P2I<U)  0 

0  1  Pad)  PjiTO 
000  pj,(l)
0  0  0  0 
Figure 4.1 Maximum Likelihood Sequence Estimator for a convolutionally encoded CDMA system.
CE:  Convolutional  encoder. 


Figure  4.2  Rate  1/2, 4-state  convolutional  code. 
and if the users have equal energy, then the effective efficiency for this error sequence is

$$\eta_1(e) = \tfrac{1}{2}\, e_6^T H_6\, e_6 = 5 - 5\rho_{21}(0) - 3\rho_{21}(1) \qquad (4.57)$$

This implies that for this particular case, a sufficient condition for the MLSE to have an
asymptotic loss relative to a single-user system is

$$\eta_1(e) = 5 - 5\rho_{21}(0) - 3\rho_{21}(1) < d_f/2 \qquad (4.58)$$

implying that, because the free distance of the code in use is $d_f = 5$, if

$$\rho_{21}(0) + \tfrac{3}{5}\rho_{21}(1) > \tfrac{1}{2} \qquad (4.59)$$

then the MLSE decoder will not achieve a single-user performance level as $N_0/2 \to 0$.

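In the same case, if the users' energies are not equal,

$$\eta_1(e) = \frac{5}{2} + \frac{5}{2}\frac{E_2}{E_1} - 5\sqrt{\frac{E_2}{E_1}}\,\rho_{21}(0) - 3\sqrt{\frac{E_2}{E_1}}\,\rho_{21}(1). \qquad (4.60)$$

This may be considered an upper bound on $\eta_{1,\min}$ since the minimum over all valid error
sequences is no larger than the $\eta_k(e)$ for a particular valid error sequence.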
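The efficiency of this particular error event is easy to evaluate numerically. The following is a minimal sketch, assuming the event has weight 5 for each user and that its efficiency is $\tfrac{5}{2} + \tfrac{5}{2}x^2 - (5\rho_{21}(0) + 3\rho_{21}(1))x$ with $x = \sqrt{E_2/E_1}$, which reduces to $5 - 5\rho_{21}(0) - 3\rho_{21}(1)$ at equal energies. The function names are illustrative and do not come from the original text.

```python
import math


def eta_error_event(x, rho0, rho1):
    """Efficiency of the weight-(5, 5) error event of (4.55) for user 1.

    x is sqrt(E2/E1); rho0 and rho1 stand for rho_21(0) and rho_21(1).
    The unequal-energy form is an assumed reading of the garbled (4.60).
    """
    return 2.5 + 2.5 * x ** 2 - (5.0 * rho0 + 3.0 * rho1) * x


def single_user_threshold(rho0, rho1):
    """Positive value of sqrt(E2/E1) at which this event's efficiency
    climbs back to d_f/2 = 2.5 (solve 2.5 x^2 = (5 rho0 + 3 rho1) x)."""
    return (5.0 * rho0 + 3.0 * rho1) / 2.5


if __name__ == "__main__":
    # Equal energies on the 0.3 channel: the event alone causes no loss.
    print(eta_error_event(1.0, 0.3, 0.3))
    # Equal energies on the 0.4 channel: the event falls below d_f/2.
    print(eta_error_event(1.0, 0.4, 0.4))
    # Energy-ratio threshold for the 0.3 channel.
    print(single_user_threshold(0.3, 0.3))
```

For $\rho_{21}(0) = \rho_{21}(1) = 0.3$ the threshold evaluates to 0.96, which agrees with the value quoted for Figure 4.3 later in this section.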

In general, an interesting result is obtained when we examine $\eta_k(e)$ for $e$ sequences
involving only single-user errors. Note that for every $e \in A_k(D,n)$ such that every
nonzero element of $e$ corresponds to user $k$ (in other words, only user $k$ is involved in the
error event),

$$\eta_k(e) = \tfrac{1}{2}\,\mathrm{wt}[e] = \tfrac{1}{2}\,\mathrm{wt}_k[e] \qquad (4.61)$$

where $\mathrm{wt}[e]$ is the weight or number of nonzero elements of $e$ (or equivalently $e_r$), and
$\mathrm{wt}_k[e]$ is the weight of user $k$'s subsequence of $e$. (User $k$'s subsequence is the set of all
$\{e_i\}$ in $e$ such that $P(i) = k$.) Because

$$\inf_{D \in C} \; \min_{e \in A_k(D,n)} \frac{\mathrm{wt}_k[e]}{2} = \frac{d_f}{2} \qquad (4.62)$$

we have the result that $\eta_k(e) \ge d_f/2$ for every $e \in A_k(D,n)$ such that every nonzero
element of $e$ is contained in user $k$'s subsequence.

This result is important because it implies that single-user error events are not
responsible if the AMCG is less than the ACG of a single user system. We thus must
examine multiple-user error events to find $\eta_{k,\min} < d_f/2$. (Recall that $d_f/2$ is the ACG
for a rate 1/2 convolutional code.)


In general, the computation of $\eta_{k,\min}$ involves a search over all $e \in A_k(D,n)$ for each
reference sequence $D \in C$. Rather than attacking this problem directly, we will lower
bound the worst-case efficiency for the 2-user situation, and will then illustrate some useful
properties of the MLSE using this bound.


By studying the $H_r$ matrix for this 2-user case, we can obtain a lower bound on the
result of equation (4.52) in the following way. Every nonzero element of $e_r$ will multiply
its corresponding element of $e_r^T$ and the corresponding diagonal element of $H_r$, and be
weighted by the energy for that element. We thus have as a part of the result of (4.52)
the weight of user 1's error subsequence multiplied by $E_1$ plus the weight of user 2's
error subsequence multiplied by $E_2$. The remaining terms in the result of (4.52) are due
to the product of elements of $e_r^T$ with other elements of $e_r$, weighted by the off-diagonal
elements of $H_r$ and $(E_1E_2)^{1/2}$. If we replace the sum of these off-diagonal
terms by a number that is no larger than any value the actual off-diagonal terms can achieve,
then we have a lower bound on equation (4.52). One possible lower bound on the
off-diagonal terms leads to the following expression, which is only a function of the weights of
the error sequences. It turns out that this expression is, in most situations, a somewhat
loose lower bound on $\eta_k(e)$. We will focus on the performance of user 1 without any loss
in generality.


$$\eta_1(e) \ge \min\left\{ f\!\left(\sqrt{E_2/E_1},\, \mathrm{wt}_1[e],\, \mathrm{wt}_2[e],\, C\right),\; \frac{d_f}{2} \right\} \qquad (4.63)$$

where

$$f\!\left(\sqrt{E_2/E_1},\, \mathrm{wt}_1[e],\, \mathrm{wt}_2[e],\, C\right) = \frac{1}{2}\left( \mathrm{wt}_1[e] + \frac{E_2}{E_1}\,\mathrm{wt}_2[e] - \sqrt{\frac{E_2}{E_1}}\,\bigl(2\min\{\mathrm{wt}_1[e], \mathrm{wt}_2[e]\} + 2\bigr)\,C \right) \qquad (4.64)$$

and where $C = |\rho_{21}(0)| + |\rho_{21}(1)|$. The function $f(\cdot)$ is a lower bound on $\eta_1(e)$ as long
as $e$ has $\mathrm{wt}_1[e] > 0$ and $\mathrm{wt}_2[e] > 0$. We have already seen from (4.61) and (4.62) that
$d_f/2$ is a lower bound on $\eta_1(e)$ when $\mathrm{wt}_2[e] = 0$, so the smaller of these two expressions
is no greater than $\eta_1(e)$ for all $e \in A_1(D,n)$.

For a fixed set of crosscorrelations and signal energies, thus a constant $C$ and constant
$\sqrt{E_2/E_1}$, the function $f(\sqrt{E_2/E_1}, \mathrm{wt}_1[e], \mathrm{wt}_2[e], C)$ describes a family of parabolas,
one for each value of $\mathrm{wt}_1[e]$ and $\mathrm{wt}_2[e]$. It is easy to show that


min 

wf, («)€{<//, 4+1,...} 


V4 

,W/i[?],Wf2l?],C 


(4.65) 


This result implies that

$$\eta_{1,\min} \ge F\!\left(\sqrt{E_2/E_1},\, C,\, d_f\right) \qquad (4.66)$$


which means that we have lower-bounded the AMCG by a function which depends only
on the users' energies, crosscorrelations and the free distance of the code. This bound
on $\eta_{1,\min}$ is valid only for the 2-user, rate 1/2 code case, but it will illustrate some very
important features of the performance of the MLSE which should remain true for the
general $K$-user, rate $P/Q$ code cases as well. This bound will illustrate these performance
features without requiring a solution to the NP-hard problem of searching for the actual
error sequence, $e$, and corresponding reference sequence, $D$, which achieve the actual
$\eta_{1,\min}$.
The first feature of the bound in (4.66) may be noted by examining the plot of $F(\cdot)$
as a function of $(E_2/E_1)^{1/2}$ shown in Figure 4.3 for $C = 0.6$ and $d_f = 5$. As the interfering
signal strength, $E_2$, becomes small relative to $E_1$, $F(\cdot)$ approaches the ACG of the single
user system. Also, as $E_2$ becomes large relative to $E_1$, $F(\cdot)$ again reaches the ACG of a
single user. In fact, for

$$\sqrt{\frac{E_2}{E_1}} > 2C\,\frac{d_f + 1}{d_f} \qquad (4.67)$$

the MLSE necessarily will have the same asymptotic performance as that of a single-user
system. Moreover, because $F(\cdot)$ is only a lower bound on the AMCG of the receiver, the
actual energy ratio above which single-user performance is achieved may be significantly
lower than the threshold given in (4.67). This point may be illustrated by the dashed line
in Figure 4.3 which is the upper bound given in equation (4.60). Without performing the
search for $\eta_{1,\min}$ we do not know whether the $\eta_1(e)$ shown for that particular $e$ is the
minimum, but if it is, then the actual threshold for $(E_2/E_1)^{1/2}$ above which single-user
asymptotic performance is achieved would be 0.96.


Another interesting feature of the bound in equation (4.66) is that it provides a
lower bound on the near-far resistance of the MLSE, which is defined as the infimum of
$\eta_{k,\min}$ over the energies of the interfering users [4]. This infimum for the function $F(\cdot)$ is


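$$\inf_{\sqrt{E_2/E_1} \in [0,\infty)} \eta_{1,\min} \;\ge\; \inf_{\sqrt{E_2/E_1} \in [0,\infty)} F\!\left(\sqrt{E_2/E_1},\, C,\, d_f\right) = \frac{d_f}{2} - \frac{(d_f+1)^2 C^2}{2 d_f} \qquad (4.68)$$

which is positive for

$$C = |\rho_{21}(0)| + |\rho_{21}(1)| < \frac{d_f}{d_f+1} \qquad (4.69)$$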


A strictly positive near-far resistance implies that the receiver will have an error
probability that goes to zero at the same exponential rate as that of a single-user system
operating with an energy penalty equal to the near-far resistance.

It is also interesting to note that as the code which is employed becomes more
powerful, or as $d_f$ increases, the conditions on the crosscorrelations of the users become
progressively less restrictive to achieve near-far resistance. In other words, a stronger
code allows the MLSE to remain near-far resistant on a channel with more severe MUI
than would be possible with a weaker code. Again, however, because $F(\cdot)$ is simply a
lower bound on $\eta_{1,\min}$, the actual AMCG may be positive when the minimum of $F(\cdot)$ is
not. Nonetheless, the fact that (4.69) implies a positive lower bound is an interesting
feature of the bound in (4.66).
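The behavior of the weight-based bound can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's computation: it takes $f = \tfrac{1}{2}\bigl(\mathrm{wt}_1 + x^2\,\mathrm{wt}_2 - x\,(2\min\{\mathrm{wt}_1, \mathrm{wt}_2\} + 2)\,C\bigr)$ with $x = \sqrt{E_2/E_1}$ as the reading of (4.64), assumes that the minimum over weights of at least $d_f$ is attained at $\mathrm{wt}_1 = \mathrm{wt}_2 = d_f$ (which holds whenever $C \le 1$), and caps the result at $d_f/2$. The closed forms for the threshold and the infimum are derived from those assumptions. All function names are hypothetical.

```python
import math


def f_bound(x, wt1, wt2, c):
    """Weight-based lower bound on eta_1(e); x is sqrt(E2/E1)."""
    return 0.5 * (wt1 + x ** 2 * wt2 - x * (2 * min(wt1, wt2) + 2) * c)


def amcg_lower_bound(x, c, d_free):
    """Assumed form of the bound on eta_{1,min}: the minimum of f over
    weights >= d_free (taken at wt1 = wt2 = d_free), capped at d_free/2."""
    return min(f_bound(x, d_free, d_free, c), d_free / 2)


def single_user_ratio_threshold(c, d_free):
    """Value of sqrt(E2/E1) above which the bound equals d_free/2."""
    return 2 * c * (d_free + 1) / d_free


def near_far_bound(c, d_free):
    """Infimum of the bound over all energy ratios (derived closed form)."""
    return d_free / 2 - (d_free + 1) ** 2 * c ** 2 / (2 * d_free)


if __name__ == "__main__":
    c, d_free = 0.6, 5
    print(single_user_ratio_threshold(c, d_free))
    print(near_far_bound(c, d_free))
    # The infimum is positive exactly when C < d_free / (d_free + 1).
    print(near_far_bound(0.8, d_free) > 0, 0.8 < d_free / (d_free + 1))
```

With $C = 0.6$ and $d_f = 5$ the infimum is positive, which matches the statement that the MLSE is near-far resistant for the 0.3 channel.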

4.3 Simulation Results

To provide some direct comparisons between the performance of the MLSE and the
conventional receiver in terms of bit error rate at moderate to low $E_b/N_0$, we will use a
computer simulation for some two-user cases. Figure 4.4 shows the results of a simulation
of a two-user system where each user employs a 4-state rate 1/2 convolutional code.
The resulting super-trellis used by the MLSE has 32 states. Figure 4.4 illustrates a fairly
severe MUI environment where $\rho_{12}(0) = 0.3$ and $\rho_{12}(-1) = 0.3$. In this case, the MLSE
is able to recoup almost all of the loss that the conventional decoder suffers when compared
with the performance in the single-user environment. In Figure 4.5, the same 0.3
channel is simulated for a varying near-far energy ratio. This figure shows that the
MLSE approach achieves a single-user performance level for sufficiently strong or
sufficiently weak interference. This result is supported by the asymptotic performance
suggested by the bound in Figure 4.3. In addition, equation (4.68) suggests that the
MLSE is near-far resistant for this case, since $C = 0.6$ and $d_f = 5$. Also, the upper bound
on the AMCG in Figure 4.3 suggests that there is not necessarily an asymptotic loss for
the MLSE relative to the single-user performance level in the equal-energy case since the
AMCG is upper bounded by 2.5 at an energy ratio of one. This is supported by the
simulation in Figure 4.4.

Figures 4.6, 4.7 and 4.8 show the performance curves for a more severe two-user
channel with $\rho_{12}(0) = 0.4$ and $\rho_{12}(-1) = 0.4$ and the same code. In all of these graphs,
we see that the performance is worse than in the corresponding figures for the 0.3 channel,
but again the MLSE is near-far resistant.

Figure 4.4 Performance curves of the MLSE (dotted line) for a 2-user channel with $\rho_{12}(0) = 0.3$
and $\rho_{12}(-1) = 0.3$ and equal energies. The solid lines show a single-user system (no MUI) with
and without the rate-1/2 4-state convolutional code and also a multiuser system with a
conventional receiver.

Figure 4.5 Near-far ratio performance curves of the MLSE on a 2-user channel with $\rho_{12}(0) = 0.3$
and $\rho_{12}(-1) = 0.3$ at $E_{b1}/N_0 = 2$ dB. The single-user system performance level (no MUI) with
the rate-1/2 4-state convolutional code is shown as a solid line and the MLSE performance is
shown as a dotted line.

Figure 4.6 Plot of lower bound on $\eta_{1,\min}$ versus $\sqrt{E_2/E_1}$ for the 2-user, $\rho_{21}(0) = \rho_{21}(1) = 0.4$
case with each user employing the code shown in Figure 4.2. Also shown is the actual $\eta_1(e)$ for
the specific error event given in equation (4.55).

Figure 4.7 Performance curve of the MLSE for a 2-user channel with $\rho_{12}(0) = 0.4$ and
$\rho_{12}(-1) = 0.4$ and equal energies. The solid lines show a single user system (no MUI) with and
without the rate-1/2 4-state convolutional code.

Figure 4.8 Near-far ratio performance curve of the MLSE on a 2-user channel with $\rho_{12}(0) = 0.4$ and
$\rho_{12}(-1) = 0.4$ at $E_{b1}/N_0 = 2$ dB for the rate-1/2 4-state convolutional code case. (Horizontal
axis: $E_{b2}/E_{b1}$ in dB.)

It is worth noting that all of the performance analysis in this chapter has been based
upon the metric for the case where each user in the system employs rate-1/2 convolutional
codes. The expression for the distance and asymptotic multiuser coding gain will
be more complicated in the general rate-P/Q code case, but the derivation procedure will
be the same. Thus the work in this chapter is meant to illustrate the general procedure for
the error analysis of the more complex general code rate case.

In conclusion, in this chapter the maximum likelihood sequence estimator was formulated
for CDMA systems where each user employs a convolutional code to improve
its performance. It was shown that the complexity of the MLSE has an exponent given
by the product of W and K (in the rate 1/2 code case). This high complexity points to
the use of suboptimal approaches to attempt to attain high performance levels with a
more reasonable complexity. In the next chapter we will pursue this goal by introducing
a number of suboptimum receiver architectures that are appropriate for coded CDMA
links.
Chapter 5 Suboptimum Multiuser Receivers for Convolutionally
Coded CDMA Links

In the last chapter, the ML sequence estimator was introduced and we saw that its
performance was significantly better than the conventional receiver's; however, its
complexity was prohibitively high. Motivated by the need for low complexity receivers with
a performance level that is similar to the optimal sequence estimator's, we search in this
chapter for low-complexity suboptimal receivers. Figure 5.1 outlines the various
approaches that will be examined in this chapter.

Through an asymptotic analysis and simulation, it will be shown that these multiuser
detection techniques are able to significantly improve the performance of the conventional
basestation architecture. In the last chapter, an important performance measure,
named the asymptotic multiuser coding gain (AMCG), was introduced. This parameter
may be defined, in general, as the required energy of a binary antipodal single-user
receiver which achieves the same performance as the multiuser receiver (as the noise
power approaches zero), divided by the required energy of a single-user binary antipodal
receiver for an uncoded link. Recall that this parameter reduces to the familiar
asymptotic multiuser efficiency (AME) parameter for the uncoded multiuser case, and to
the asymptotic coding gain (ACG) in the single-user coded case. Several of the decision
feedback approaches which will be studied in this chapter do not lend themselves to an
analysis in terms of AMCG. As a result, these approaches will be compared with the
important baseline architectures via a computer simulation.

Rather than introducing the suboptimum receivers of Figure 5.1 in the order that
they appear in the figure, it will be preferable to first discuss the partitioned approaches,
and then to discuss the combined equalization and decoding approaches afterwards. This
presentation will provide a unified view of the various approaches.
Demodulators for CDMA systems with convolutional codes

    With Decoding Only (no Equalization): 'Conventional'
    With Equalization and Decoding
        Linear Equalization: separate equalization, or combined equalization
        Decision Feedback Equalization: separate equalization, or combined equalization
        Trellis Based: separate equalization, or combined equalization and decoding
            (MLSE, the optimal sequence estimator)

Figure 5.1: Tree diagram of the possible receiver structures for CDMA systems operating with convolutional codes.
All of the partitioned approaches (separate equalization and decoding) can be implemented in a hard or
soft decision form.

(The approaches in boxes will be discussed in this chapter.)
5.1 Hard-Decision Partitioned Approaches

The broad class of multiuser receiver architectures which treat equalization of the
MUI and decoding of the code separately, as shown in Figure 5.2, will be referred to as
partitioned multiuser receivers. Within this class of partitioned approaches are those
which use a hard-decision multiuser receiver to supply hard decisions from the inner
channel to a bank of outer Viterbi decoders, and those which use soft-decision multiuser
receivers to supply soft decisions to the outer decoders. For the hard-decision case,
sufficient interleaving can provide the outer decoders with statistics which can be
accurately modeled as the outputs of a bank of $K$ binary symmetric channels. This level of
sufficient interleaving will typically be achieved with a block interleaver which has a
width equal to the release depth of the outer Viterbi algorithms (roughly five times the
constraint length, $W$), and a depth greater than the average length of an error event
(which is only a few code symbols at high SNR).

Before analyzing the performance of a hard-decision partitioned multiuser receiver,
it will be useful to first consider the performance of a hard-decision receiver operating on
a coded link with no interferers. Consider, without any loss in generality, the performance
of user $k$ operating in isolation. For this case, the crossover probability for the
binary symmetric channel is

$$p_k = Q\left(\sqrt{\frac{2 R_c E_{bk}}{N_0}}\right) \qquad (5.1)$$

since the energy per code symbol is $R_c E_{bk}$. The first-event error probability in the Viterbi
decoder can be bounded by

$$P_e \le \sum_{d=d_f}^{\infty} a_d P_2(d) \qquad (5.2)$$

where $a_d$ is the multiplicity of paths with a distance $d$ from the desired path and $P_2(d)$ is
the probability of confusing two sequences which are $d$ Hamming units apart. The bit
error probability can be bounded by a similar sum. If we define


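$$t = \left\lfloor \frac{d_f - 1}{2} \right\rfloor \qquad (5.3)$$

where the function $\lfloor x \rfloor$ gives the largest integer smaller than or equal to $x$, then as
$N_0/2 \to 0$, the first-event error probability becomes dominated by the leading term in the
series (5.2), namely

$$P_e \approx a_{d_f} P_2(d_f) \approx a_{d_f}\, p_k^{t+1} \quad \text{as } N_0/2 \to 0 \qquad (5.4)$$

This term may be upper-bounded using (5.1) and the asymptotically tight upper bound
$Q(x) \le \tfrac{1}{2}\exp(-x^2/2)$:

$$a_{d_f}\, p_k^{t+1} \le a_{d_f} \left(\frac{1}{2}\right)^{t+1} \exp\left(-\frac{E_{bk}}{N_0}\, R_c\, (t+1)\right) \qquad (5.5)$$

It follows from (5.4) and (5.5) that the ACG for this hard-decision receiver is

$$\mathrm{ACG} = R_c\,(t+1) = R_c \left( \left\lfloor \frac{d_f - 1}{2} \right\rfloor + 1 \right) \qquad (5.6)$$

in terms of an absolute ratio, and in terms of decibels as $\mathrm{ACG}_{dB} = 10 \log(\mathrm{ACG})$.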
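As a quick check of the hard-decision coding gain, the following minimal sketch evaluates $R_c(\lfloor (d_f - 1)/2 \rfloor + 1)$ for the two codes used in this chapter. It assumes that reading of (5.6); the function names are illustrative only.

```python
import math


def hard_decision_acg(code_rate, d_free):
    """Asymptotic coding gain of a hard-decision decoder (absolute ratio)."""
    t = (d_free - 1) // 2  # number of channel errors always corrected
    return code_rate * (t + 1)


def to_db(ratio):
    """Convert an absolute energy ratio to decibels."""
    return 10.0 * math.log10(ratio)


if __name__ == "__main__":
    for d_free in (5, 10):
        acg = hard_decision_acg(0.5, d_free)
        print(d_free, acg, round(to_db(acg), 2))
```

The rate 1/2 code with $d_f = 5$ gives 1.5 and the rate 1/2 code with $d_f = 10$ gives 2.5, the values quoted in the captions of Figures 5.3 through 5.6.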

With this single-user case as a foundation, we may proceed to analyze the AMCG
for a partitioned hard-decision multiuser receiver. Consider again the receiver illustrated
by Figure 5.2. If the interleaving is perfect, in the sense that it has an adequate depth to
provide a memoryless inner channel for the outer decoders, then the overall channel may
be modeled as a bank of $K$ binary symmetric channels. The crossover probability for the
$k$th user's binary symmetric channel is then given by

$$p_k \approx b_\eta\, Q\left(\sqrt{\frac{2\,\eta_k R_c E_{bk}}{N_0}}\right) \qquad (5.7)$$

where $\eta_k$ is the AME of the $k$th user's multiuser receiver which operates on the
sequence of code symbols as though they were uncoded symbols, and $b_\eta$ is the effective
multiplicity of competing sequences of distance $\eta$. Equation (5.2) again gives the
first-event error probability and the first term of this series will again dominate for low noise
situations:

$$P_e \approx a_{d_f}\, p_k^{t+1} \quad \text{as } N_0/2 \to 0 \qquad (5.8)$$

Figure 5.2 CDMA link with a partitioned receiver. If the interleaving can successfully
break up the inner channel's memory, then the link within the dotted box
may be accurately modeled as $K$ parallel BSCs in the hard-decision
multiuser receiver case.
(r = n if there is no interleaving on the link)


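As in (5.5), this leading term may be upper bounded as follows:

$$a_{d_f}\, p_k^{t+1} \le a_{d_f} \left(\frac{b_\eta}{2}\right)^{t+1} \exp\left(-\frac{E_{bk}}{N_0}\, R_c\, \eta_k\, (t+1)\right) \qquad (5.9)$$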


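Because this bound is asymptotically tight, we have

$$\mathrm{AMCG}_k = \eta_{k,\min} = R_c\, \eta_k \left( \left\lfloor \frac{d_f - 1}{2} \right\rfloor + 1 \right) \qquad (5.10)$$

as long as the interleaving is perfect.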

This is an important result because it is a simple relation for the AMCG of the
overall hard-decision partitioned multiuser receiver in terms of 1) the code rate, 2) the
free distance of the code and 3) the AME of the multiuser receiver which is employed for
making code bit decisions. The AME has been computed for many multiuser receivers
on uncoded links, and so using those results from the literature, we may easily compute
the AMCG for the perfectly interleaved hard-decision partitioned receiver of interest.

We  now  state  some  of  the  AME  expressions  from  the  literature  and  use  these  results 
to  compute  the  AMCG  using  (5.10).  In  [2]  and  [3],  the  AME  for  user  one  in  a  two-user 
system  is  given  for  the  optimum  sequence  estimator  as 


In [4], the AME for user one in a two-user system is given for the decorrelator with an
infinite horizon as

$$\eta_1 = \sqrt{\left[1 - (\rho_{21}(0) + \rho_{21}(1))^2\right]\left[1 - (\rho_{21}(0) - \rho_{21}(1))^2\right]} \qquad (5.12)$$


In  [2]  the  AME  of  the  conventional  receiver  was  derived  for  the  2-user  case,  and  the 
result  was 


where, as in Chapter 4, $C = |\rho_{21}(0)| + |\rho_{21}(1)|$. Finally, in [29], an expression for the
AME of the 2-stage Varanasi multistage DFE is given. This expression will not be
repeated here for the sake of brevity; the interested reader can refer to [29].
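To make the use of these AME expressions concrete, the following sketch combines the decorrelator AME with the hard-decision relation $\mathrm{AMCG}_k = R_c\,\eta_k\,(\lfloor (d_f - 1)/2 \rfloor + 1)$. It assumes that reading of (5.10) and the decorrelator form $\eta_1 = \sqrt{[1 - (\rho_{21}(0) + \rho_{21}(1))^2][1 - (\rho_{21}(0) - \rho_{21}(1))^2]}$; the function names are hypothetical and the other inner receivers are not modeled here.

```python
import math


def decorrelator_ame(rho0, rho1):
    """Two-user, infinite-horizon decorrelator AME for user 1.

    rho0 and rho1 stand for rho_21(0) and rho_21(1).
    """
    return math.sqrt((1 - (rho0 + rho1) ** 2) * (1 - (rho0 - rho1) ** 2))


def hard_decision_amcg(code_rate, d_free, ame):
    """AMCG of a perfectly interleaved hard-decision partitioned receiver."""
    t = (d_free - 1) // 2
    return code_rate * ame * (t + 1)


if __name__ == "__main__":
    for rho in (0.2, 0.3, 0.4):
        ame = decorrelator_ame(rho, rho)
        # Rate 1/2 codes with d_f = 5 (4-state) and d_f = 10 (64-state).
        print(rho, round(ame, 4),
              round(hard_decision_amcg(0.5, 5, ame), 4),
              round(hard_decision_amcg(0.5, 10, ame), 4))
```

Because the decorrelator AME does not depend on the energy ratio, its AMCG under this relation is a constant for each channel, scaled down from the single-user ACG by the AME.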

Using all of these expressions and (5.10), we may plot the AMCG for a hard-decision
partitioned receiver in a 2-user system with a conventional inner receiver, a
decorrelator, and a ML sequence estimator. In Figures 5.3, 5.4, 5.5 and 5.6, the AMCG
for user 1 is plotted versus $\sqrt{E_2/E_1}$ for some specific codes and channel conditions. In
Figure 5.3, the curves are plotted for the case where both users employ a rate 1/2 4-state
code with $d_f = 5$, and $\rho_{21}(0) = \rho_{21}(1) = 0.2$. For this code, by equation (5.6) we know
that the ACG for user one operating in isolation is ACG = 1.5, or 10 log(1.5) dB. Figures
5.4 and 5.5 show the AMCG versus near-far energy ratio again for the same code with
$d_f = 5$, but this time with $\rho_{21}(0) = \rho_{21}(1) = 0.3$ in Figure 5.4 and $\rho_{21}(0) = \rho_{21}(1) = 0.4$ in
Figure 5.5. These figures illustrate that as the channel cross-correlations become greater,
the achievable multiuser coding gain for the partitioned receivers drops, although the
benefits of coding remain the same. Figure 5.6 shows the same curves for the case where
the codes employed are rate 1/2 64-state codes which have $d_f = 10$, again with
$\rho_{21}(0) = \rho_{21}(1) = 0.3$. From this figure and equation (5.10), it is clear that a stronger
code is able to improve the achievable multiuser coding gain given the same channel
conditions (compare with Figure 5.4).
Figure 5.3 Plot of AMCG for the 2-user, $\rho_{21}(0) = \rho_{21}(1) = 0.2$ case where both users employ a rate 1/2
4-state code with $d_f = 5$. The ACG for a single-user system using this code is 10 log(1.5) dB for a
hard-decision decoder.

Figure 5.4 Plot of AMCG for the 2-user, $\rho_{21}(0) = \rho_{21}(1) = 0.3$ case where both users employ a rate 1/2
4-state code with $d_f = 5$. The ACG for a single-user system using this code is 10 log(1.5) dB for a
hard-decision decoder.

Figure 5.5 Plot of AMCG for the 2-user, $\rho_{21}(0) = \rho_{21}(1) = 0.4$ case where both users employ a rate 1/2
4-state code with $d_f = 5$. The ACG for a single-user system using this code is 10 log(1.5) dB for a
hard-decision decoder.

Figure 5.6 Plot of AMCG for the 2-user, $\rho_{21}(0) = \rho_{21}(1) = 0.3$ case where both users employ a rate 1/2
64-state code with $d_f = 10$. The ACG for a single-user system using this code is 10 log(2.5) dB for a
hard-decision decoder. (Horizontal axis: $\sqrt{E_2/E_1}$.)
The complexity of the various receivers may be measured with the time complexity
per decoded bit (TCB). The overall TCB will be the sum of the TCB of the outer Viterbi
decoders and $Q$ times the TCB of the inner multiuser receiver since $Q$ code bits must be
decided for every stage of the outer decoders. If we assume that the code is a rate $P/Q$
code and has a binary memory order of $\kappa$ bits, then the outer Viterbi algorithms will have
a TCB $= O(2^{\kappa+P}/P)$. (Note that if code puncturing is used to obtain a rate $P/Q$ code
from a rate $1/Q$ code, then the complexity of the outer Viterbi algorithms will be TCB $=
O(2^{\kappa+1})$ since there are $2^{\kappa}$ states and 2 branches per state.) Furthermore, the
conventional inner receiver will have $\mathrm{TCB}_{conv.} = O(1)$, the MLSE inner receiver will have
$\mathrm{TCB}_{MLSE} = O(2^K)$ and the $J$-stage DFE inner receiver will have $\mathrm{TCB}_{DFE} = O(J)$, assuming
that the complexity of one MUI calculation is roughly equivalent to a metric calculation
(which it often is not). It may be more useful to compare the rough number of arithmetic
operations (or multiplications) per decoded bit to make a comparison with the
decorrelator. For the MLSE this is $O(K 2^K)$, and for the $J$-stage DFE this is $O(JK)$. We
know from Section 2.1.2 that the decorrelator requires roughly $O(\delta K)$ operations per
decoded bit if $\delta$ is the impulse response truncation depth of a decorrelator which is
implemented with an FIR matrix filter. It follows that the overall number of arithmetic
operations per decoded information bit for the hard decision partitioned receivers is on
the order of $\mathrm{OP}_{conv} \sim O(2^{\kappa+P}/P)$ for the conventional inner receiver,
$\mathrm{OP}_{MLSE} \sim O([QK2^K + 2^{\kappa+P}]/P)$ for the MLSE inner receiver,
$\mathrm{OP}_{JDFE} \sim O([QKJ + 2^{\kappa+P}]/P)$ for the $J$-stage DFE inner receiver and
$\mathrm{OP}_{dec.} \sim O([QK\delta + 2^{\kappa+P}]/P)$ for the decorrelator inner receiver.
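The operation counts above can be compared side by side with a small helper. This is a rough sketch that treats the order-of-magnitude expressions as exact counts, which they are not, and assumes the outer decoder cost is $2^{\kappa+P}$ per trellis stage; the function names and the example parameter values are hypothetical.

```python
def inner_ops_per_code_bit(receiver, num_users, stages=1, trunc_depth=1):
    """Rough arithmetic operations per code bit for the inner receiver."""
    table = {
        "conventional": 0,  # O(1); dropped from the overall count
        "mlse": num_users * 2 ** num_users,
        "dfe": stages * num_users,
        "decorrelator": trunc_depth * num_users,
    }
    return table[receiver]


def ops_per_info_bit(inner_ops, q, p, memory_bits):
    """Overall operations per decoded information bit for a rate P/Q code.

    Assumes 2**(memory_bits + p) operations per outer Viterbi stage,
    with p information bits released per stage.
    """
    return (q * inner_ops + 2 ** (memory_bits + p)) / p


if __name__ == "__main__":
    # Two users, rate 1/2 code with 2 memory bits, 2-stage DFE,
    # decorrelator truncated to 5 taps (all values chosen for illustration).
    for name in ("conventional", "mlse", "dfe", "decorrelator"):
        inner = inner_ops_per_code_bit(name, num_users=2, stages=2, trunc_depth=5)
        print(name, ops_per_info_bit(inner, q=2, p=1, memory_bits=2))
```

For small $K$ the outer decoder dominates every architecture; the MLSE term $QK2^K$ takes over only as the number of users grows.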

Another consideration in the choice of receiver architecture might be the decoding
delay for the receiver. The overall decoding delay for a partitioned receiver will be the
sum of the decoding delays of the inner receiver, the deinterleaver and the outer Viterbi
decoder. The decoding delay of the outer Viterbi decoders will typically be a few times
the constraint length, $W$, of the code in use. We may assume for comparison's sake that
this delay is $\Delta_{VA} = 5WT_s$ seconds. The decoding delay for the conventional inner
receiver will be zero, except perhaps for the delay associated with the quantization
process (which will be negligible). A $J$-stage DFE will have a decoding delay of
$\Delta_{JDFE} = (J-1)T_s/Q$. An MLSE which has a release depth of $5K$ stages will have a
delay of roughly $\Delta_{MLSE} = 5KT/K = 5T_s/Q$ seconds. A decorrelator or MMSE linear
inner receiver with an impulse response truncation depth of $\delta$ taps will have a
decoding delay of $\Delta_{dec.} \approx \delta T/2 = \delta T_s/2Q$ seconds. This implies that if
we neglect the deinterleaver's delay, since it will presumably be the same for all
receivers, we get the following overall delays for the hard-decision partitioned receivers:
$\Delta_{P|conv} = 5WT_s$ for the conventional inner receiver,
$\Delta_{P|MLSE} = 5T_s/Q + 5WT_s$ for the MLSE inner receiver,
$\Delta_{P|JDFE} = (J-1)T_s/Q + 5WT_s$ for the $J$-stage DFE inner receiver and
$\Delta_{P|dec.} \approx \delta T_s/2Q + 5WT_s$ for the decorrelating inner receiver. It is worth
noting that all of these decoding delays are roughly the same.
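The delay comparison above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name and the parameter values (W = 3, Q = 2, J = 3 and a truncation depth of 10 taps) are illustrative assumptions, and delays are expressed in units of the symbol interval $T_s$ with the deinterleaver delay neglected.

```python
from fractions import Fraction


def partitioned_delays(W, Q, J, delta):
    """Overall decoding delays of the hard-decision partitioned receivers.

    Delays are returned in units of T_s. The outer Viterbi decoder is
    assumed to contribute 5*W*T_s, as in the text; the deinterleaver
    delay is neglected because it is common to all receivers.
    """
    outer = Fraction(5 * W)
    inner = {
        "conventional": Fraction(0),             # no inner delay
        "MLSE": Fraction(5, Q),                  # 5*T_s/Q
        "J-stage DFE": Fraction(J - 1, Q),       # (J-1)*T_s/Q
        "decorrelator": Fraction(delta, 2 * Q),  # delta*T_s/(2Q)
    }
    return {name: d + outer for name, d in inner.items()}


# Hypothetical parameter values chosen only for illustration.
delays = partitioned_delays(W=3, Q=2, J=3, delta=10)
for name, value in delays.items():
    print(f"{name:>13}: {float(value):.2f} T_s")
```

For these assumed values the inner receiver adds at most a few symbol intervals to the 15 $T_s$ of the outer decoder, which is the sense in which the delays are roughly the same.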

5.2 Soft-Decision Partitioned Approaches

The computation of the AMCG for the soft-decision partitioned approaches is more
difficult than for the hard-decision case. We will have to write expressions for the decision
statistics at the outer Viterbi decoders for the various inner multiuser receivers, and
then upper bound the worst-case values of these decision statistics to obtain lower bound
expressions for the AMCG of the overall receivers. It is interesting and important to note
that the conventional receiver may be viewed as a member of the class of soft-decision
partitioned receivers with a degenerate multiuser receiver which simply passes the
matched filter outputs to the outer Viterbi algorithms without altering them. As a result,
by analyzing the multiuser receivers in this class, we will also be analyzing the important
conventional receiver's performance.

Before analyzing the soft-decision partitioned multiuser receivers in detail, however,
it will again be useful to first consider the single-user case. To analyze the ACG for
a soft-decision receiver operating in isolation, we begin by bounding the first-event error
probability:

$$P_e \le \sum_{d=d_f}^{\infty} a_d P_2(d) \tag{5.14}$$

where $a_d$ is the number of error events at distance $d$ and $P_2(d)$ is the probability of
confusing two sequences which are a distance $d$ Hamming units apart. The bit error
probability will be bounded by a sum which is similar to (5.14). On a standard additive
white Gaussian noise channel, the two-sequence error probability is given by

$$P_2(d) = Q\left(\sqrt{\frac{2E_{bk}}{N_0}\, d R_c}\right) \tag{5.15}$$

Thus using the asymptotically tight bound $Q(x) \le \tfrac{1}{2}\exp(-x^2/2)$, we have

$$P_e \le \frac{1}{2}\sum_{d=d_f}^{\infty} a_d \exp\left(-\frac{E_{bk}}{N_0}\, d R_c\right) \tag{5.16}$$

As $N_0/2 \to 0$, the first-event error probability becomes dominated by the leading term in
the series. Thus

$$P_e \approx \frac{a_{d_f}}{2}\exp\left(-\frac{E_{bk}}{N_0}\, d_f R_c\right) \quad \text{as } N_0/2 \to 0 \tag{5.17}$$

and we may recognize the asymptotic coding gain from (5.17) as

$$ACG = d_f R_c \tag{5.18}$$

which is usually expressed as $ACG_{dB} = 10\log(ACG)$ dB. When this is compared to the
hard-decision result in equation (5.10), we see that the soft-decision decoder has between
2 and 3 dB better performance than the hard-decision decoder. This is a familiar result
(see [64] or [67]).
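As a quick numerical check of the soft-decision versus hard-decision comparison, the sketch below evaluates $ACG = d_f R_c$ for the rate 1/2, 4-state code with $d_f = 5$ used as the running example in this chapter. The hard-decision value of 1.5 is taken from the figure captions of this chapter rather than recomputed, and the helper name is an illustrative assumption.

```python
import math


def acg_db(acg):
    """Express an asymptotic coding gain in decibels."""
    return 10.0 * math.log10(acg)


d_f, R_c = 5, 0.5
soft_acg = d_f * R_c   # soft-decision ACG from (5.18)
hard_acg = 1.5         # hard-decision ACG quoted in the figure captions

gap_db = acg_db(soft_acg) - acg_db(hard_acg)
print(f"soft: {acg_db(soft_acg):.2f} dB, hard: {acg_db(hard_acg):.2f} dB, "
      f"gap: {gap_db:.2f} dB")
```

For this code the gap comes out a little above 2 dB, consistent with the stated range of 2 to 3 dB.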

With the single-user ACG clearly defined, we may now move on to examine the
AMCG for soft-decision partitioned multiuser receivers. Consider the system shown in
Figure 5.2 again. If the deinterleaved outputs of the multiuser receiver are now considered
to be soft outputs denoted by $y_k^{(n)}(q)$ for the $k$th user's code bit $q$ in the $n$th
interval, then the first question to be asked is, "What is the structure of the optimal
subsequent decoders?" If $y_k^{(n)}(q)$ is conditionally Gaussian given the information bit
sequence for user $k$, then the appropriate decoding strategy for sequences over a decoding
window $n = l_0$ to $l_0 + \Gamma$ is to use a Viterbi algorithm with the following correlation
metric

$$\Lambda(\mathbf{y}_k | \mathbf{D}_k) = \sum_{n=l_0}^{l_0+\Gamma}\sum_{q=1}^{Q} y_k^{(n)}(q)\, D_k^{(n)}(q) \tag{5.19}$$

where $\mathbf{y}_k$ is the deinterleaved sequence of soft-decision outputs of user $k$'s multiuser
receiver and $\mathbf{D}_k$ is the sequence of transmitted code symbols for user $k$. Note that to
maintain consistency with the "horizon" used in Chapter 4, $l_0 = \alpha(-M)$ and
$\Gamma = \alpha(M) - l_0$.

The notation used in equation (5.19) is going to become overly complex later, and so
we will simplify this equation by defining a modulo $Q$ decomposition of an index, $j$, in
the same way that we defined the modulo $K$ decomposition in Chapter 4. In this way, we
can write (5.19) with a single sum which accumulates all $Q$ of the code bits for each
interval, $n$, for user $k$.

$$\Lambda(\mathbf{y}_k | \mathbf{D}_k) = \sum_{j=i_0}^{i_0+Q\Gamma} y_{kj} D_{kj} \tag{5.20}$$

In this equation, $y_{kj} = y_k^{(\alpha(j))}(\beta(j))$, $D_{kj} = D_k^{(\alpha(j))}(\beta(j))$, and $j = \alpha(j)Q + \beta(j) - 1$. (Note that
we  assume  that  I’o  =  P(/o)  without  a  loss  in  generality) 
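The modulo $Q$ decomposition can be illustrated with a short sketch. It assumes, as the relation $j = \alpha(j)Q + \beta(j) - 1$ suggests, that $\alpha(j)$ is the interval index and $\beta(j) \in \{1, \ldots, Q\}$ is the code bit index within the interval. The function names, the test data and the flattened index range (chosen so that every pair $(n, q)$ in the window appears exactly once) are assumptions made for illustration only.

```python
def alpha(j, Q):
    """Interval index of flattened index j (floor division handles j < 0)."""
    return j // Q


def beta(j, Q):
    """Code bit index within the interval, in 1..Q."""
    return j % Q + 1


def metric_double_sum(y, D, l0, Gamma, Q):
    """Correlation metric written as a sum over intervals n and code bits q."""
    return sum(y[(n, q)] * D[(n, q)]
               for n in range(l0, l0 + Gamma + 1)
               for q in range(1, Q + 1))


def metric_single_sum(y, D, l0, Gamma, Q):
    """The same metric written as a single sum over the flattened index j."""
    first = l0 * Q
    last = (l0 + Gamma) * Q + Q - 1
    return sum(y[(alpha(j, Q), beta(j, Q))] * D[(alpha(j, Q), beta(j, Q))]
               for j in range(first, last + 1))


# Hypothetical integer-valued data so that the two sums can be compared exactly.
Q, l0, Gamma = 2, -1, 2
y = {(n, q): 3 * n + q
     for n in range(l0, l0 + Gamma + 1) for q in range(1, Q + 1)}
D = {(n, q): (1 if (n + q) % 2 == 0 else -1) for (n, q) in y}

double_sum = metric_double_sum(y, D, l0, Gamma, Q)
single_sum = metric_single_sum(y, D, l0, Gamma, Q)
```

Both forms accumulate the same products, so the single-index notation only changes the bookkeeping and not the metric.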

The metric for any competing sequence in the trellis, $\mathbf{D}_k + 2\mathbf{e}_k$, will be

$$\Lambda(\mathbf{y}_k | \mathbf{D}_k + 2\mathbf{e}_k) = \sum_{j=i_0}^{i_0+Q\Gamma} y_{kj}\left[D_{kj} + 2e_{kj}\right] \tag{5.21}$$

where $\mathbf{e}_k$ is user $k$'s subsequence of $\mathbf{e}$ from Chapter 4, and
$e_{kj} = e_k^{(\alpha(j))}(\beta(j))$. It thus follows that the two-sequence error probability will
be given by

$$P_2(\mathbf{e}_k) = P\left[\Lambda(\mathbf{y}_k | \mathbf{D}_k + 2\mathbf{e}_k) > \Lambda(\mathbf{y}_k | \mathbf{D}_k)\right] = P\left[\sum_{j=i_0}^{i_0+Q\Gamma} y_{kj} e_{kj} > 0\right] \tag{5.22}$$

where $P_2(\mathbf{e}_k)$ denotes the two-sequence error probability for sequences differing by $\mathbf{e}_k$,
and it is assumed that the nonzero portion of the error sequence $\{e_{kj}\}$ occurs in the region
$i_0 \le j \le i_0 + Q\Gamma$.

To proceed, we need to be able to characterize $y_{kj}$. To do this, let

$$y_{kj} = \sqrt{E_k}\, D_{kj} + N_{kj} \tag{5.23}$$

where $N_{kj}$ is the noise (or perturbation in general) for user $k$'s code bit $\beta(j)$ in the
$\alpha(j)$th interval after deinterleaving the soft-decision multiuser receiver outputs. The
characteristics of the noise will depend on the inner multiuser receiver in use.

5.2.1 The Conventional Receiver

With this generic description of the inputs to the outer Viterbi decoders, we may
now consider a number of special cases for specific soft-decision multiuser inner
receivers. One of the most important special cases of the soft-decision partitioned
receiver is the conventional receiver. This receiver essentially uses a degenerate multiuser
receiver which simply passes the matched filter outputs on to the outer Viterbi
algorithms without altering them in any way, except possibly descrambling them in a
deinterleaver. For the conventional receiver, each input to the outer Viterbi algorithms
consists of a desired part and a noise part, the latter being

$$N_{kj} = RMUI_{kj} + Z_{kj} = MUI_{kj} + Z_{kj} \tag{5.24}$$

$RMUI_{kj}$ denotes the residual MUI, which for the conventional receiver is equal to the
MUI on the matched filter output, and $Z_{kj}$ denotes the Gaussian portion of the noise. It is
worth noting that this overall noise is not Gaussian, and so the use of the correlation
metric in the outer Viterbi decoders is not optimum. The noise is Gaussian conditioned
on the signal plus interference; however, the outer Viterbi algorithms do not condition on
the interference due to the other users. An interesting research topic might be to derive
the metric for the outer decoders which is optimal for the noise due to the sum of the
Gaussian noise and the interference. We will not proceed in this direction, but will simply
note the suboptimality of the adopted metric. It is worth noting that it is common to
appeal to the Central Limit Theorem to claim that the noise is approximately Gaussian.
This leads to the claim that the correlation metric is appropriate for the outer decoders.
The Central Limit Theorem leads to misleading and overly optimistic conclusions in
many cases, however.

Because the noise statistic in (5.24) will be Gaussian conditioned on a given
sequence of the desired user and the interferers, we could obtain a performance estimate
by averaging with respect to all possible sequences. This performance will be asymptotically
determined by the worst-case interference. As a result, we may bound the residual
MUI by its effective worst-case value for completely unconstrained interferers to
obtain a lower bound on the AMCG of the conventional receiver. The worst-case value
of $RMUI_{kj}$ when the constraints on the other users' transmitted code sequences are taken
into account will be no greater than (and most often lower than) the value assuming
unconstrained sequences. If interleaving is used on the link, then the interference patterns
will be closer to unconstrained interference patterns since the interleaving will effectively
break up the code's constraints. Another point worth noticing is that according to (5.24),
the noise on the receiver outputs is the same as the noise on the matched filter outputs,
albeit potentially scrambled from the deinterleaving process. The noise sequence
$\{Z_{kj}\}$ is white, and the deinterleaving will not affect this.

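With this characterization of $y_{kj}$, we may proceed to substitute (5.24) and (5.23) into
(5.22).

$$P_2(\mathbf{e}_k) = P\left[\sum_{j=i_0}^{i_0+Q\Gamma} e_{kj} Z_{kj} > -\sum_{j=i_0}^{i_0+Q\Gamma} e_{kj}\left(\sqrt{E_k}\, D_{kj} + MUI_{kj}\right)\right] \tag{5.25}$$

Next define

$$\beta = \sum_{j=i_0}^{i_0+Q\Gamma} e_{kj} Z_{kj} \tag{5.26}$$

Note that whenever $e_{kj} \ne 0$, then $e_{kj} = -D_{kj}$ (since $D_{kj} + 2e_{kj}$ must be a valid
code sequence). Making these substitutions we get

$$P_2(\mathbf{e}_k) = P\left[\beta > \sum_{j=i_0}^{i_0+Q\Gamma}\left(\sqrt{E_k}\,(e_{kj})^2 - e_{kj} MUI_{kj}\right)\right] \tag{5.27}$$

Next, replace $e_{kj} MUI_{kj}$ by its largest possible value for completely unconstrained
interferers,

$$e_{kj} MUI_{kj} \le (e_{kj})^2\left[\sum_{m=1}^{k-1}|\rho_{km}(1)|\sqrt{E_m} + \sum_{m \ne k}|\rho_{km}(0)|\sqrt{E_m} + \sum_{m=k+1}^{K}|\rho_{km}(-1)|\sqrt{E_m}\right] \tag{5.28}$$

Thus

$$P_2(\mathbf{e}_k) \le P\left[\beta > \sum_{j=i_0}^{i_0+Q\Gamma}(e_{kj})^2\,\gamma_k\right] \tag{5.29}$$

where

$$\gamma_k = \sqrt{E_k} - \left[\sum_{m=1}^{k-1}|\rho_{km}(1)|\sqrt{E_m} + \sum_{m \ne k}|\rho_{km}(0)|\sqrt{E_m} + \sum_{m=k+1}^{K}|\rho_{km}(-1)|\sqrt{E_m}\right] \tag{5.30}$$

We may next note that

$$wt[\mathbf{e}_k] = \sum_{j=i_0}^{i_0+Q\Gamma}(e_{kj})^2 \tag{5.31}$$

so (5.29) becomes

$$P_2(\mathbf{e}_k) \le P\left[\beta > wt[\mathbf{e}_k]\,\gamma_k\right] \tag{5.32}$$

Because $\beta$ is a linear combination of white Gaussian random variables of zero mean and
variance $N_0/2$, it is not difficult to show that $E[\beta] = 0$, and $E[\beta^2] = wt[\mathbf{e}_k]\,N_0/2$. It
follows that

$$P_2(\mathbf{e}_k) \le Q\left(\sqrt{\frac{2E_{bk}}{N_0}}\;\frac{\gamma_k}{\sqrt{E_{bk}}}\;\sqrt{wt[\mathbf{e}_k]}\right) \quad \text{for } \gamma_k \ge 0 \tag{5.33}$$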

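which implies that

$$\eta_k(\mathbf{e}_k) \ge \frac{\gamma_k^2}{E_{bk}}\, wt[\mathbf{e}_k] \quad \text{for } \gamma_k \ge 0 \tag{5.34}$$

Since $\gamma_k$, which was defined in (5.30), may be negative, we may generalize the bound in
(5.34) to be

$$\eta_k(\mathbf{e}_k) \ge \max{}^2\left\{0,\ \gamma_k\left(wt[\mathbf{e}_k]/E_{bk}\right)^{1/2}\right\} \tag{5.35}$$

Finally, since $E_{bk} = E_k/R_c$, we note that

$$\eta_{k,\min} = \inf_{\text{valid } \mathbf{e}_k} \eta_k(\mathbf{e}_k) \ge \max{}^2\left\{0,\ \gamma_k\left(d_f R_c/E_k\right)^{1/2}\right\} \tag{5.36}$$

Stated in a different form we have the final bound

$$\eta_{k,\min} \ge \max{}^2\left\{0,\ (d_f R_c)^{1/2}\left[1 - \sum_{m=1}^{k-1}|\rho_{km}(1)|\left(\frac{E_m}{E_k}\right)^{1/2} - \sum_{m \ne k}|\rho_{km}(0)|\left(\frac{E_m}{E_k}\right)^{1/2} - \sum_{m=k+1}^{K}|\rho_{km}(-1)|\left(\frac{E_m}{E_k}\right)^{1/2}\right]\right\} \tag{5.37}$$

For the 2-user, $R_c = 1/2$ case, this bound takes the form

$$\eta_{1,\min} \ge \begin{cases} \dfrac{d_f}{2}\left[1 - C\sqrt{E_2/E_1}\right]^2 & \text{if } C\sqrt{E_2/E_1} \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{5.38}$$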

(recall that $C = |\rho_{21}(0)| + |\rho_{21}(1)|$). This lower bound on the AMCG for the
conventional receiver is potentially loose if the coding imposes severe restrictions on the
allowable interfering sequences. This is due to the fact that the worst-case allowable
interfering sequence may be much less severe than the unconstrained worst-case sequence.
Nonetheless, with good interleaving, we believe that it will be a reasonable approximation.

The bound in (5.38) is plotted in Figures 5.7, 5.8 and 5.9 for the $C$ = 0.4, 0.6 and 0.8
cases respectively. From these figures, we see that as the energy of the interferer grows,
the AMCG of user 1's conventional receiver drops to zero. This zero AMCG typically
implies that the receiver will have a performance floor.

The TCB and decoding delay of the conventional receiver with soft decisions will
be the same as those of the partitioned hard-decision receiver with a conventional inner
receiver.
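The behaviour of the conventional receiver's bound can be explored numerically. The sketch below reads the two-user bound as $\max^2\{0, (d_f R_c)^{1/2}[1 - C\sqrt{E_2/E_1}]\}$, which is how (5.37) specializes for $K = 2$ and $k = 1$; that reading, the function name and the energy ratios used are assumptions made for illustration.

```python
import math


def amcg_conventional_lower_bound(d_f, R_c, C, energy_ratio):
    """Lower bound on user 1's AMCG for the conventional receiver.

    energy_ratio is E_2/E_1 and C is |rho_21(0)| + |rho_21(1)|.
    """
    margin = 1.0 - C * math.sqrt(energy_ratio)
    return d_f * R_c * max(0.0, margin) ** 2


d_f, R_c = 5, 0.5
for C in (0.4, 0.6, 0.8):
    row = [amcg_conventional_lower_bound(d_f, R_c, C, 10 ** (db / 10))
           for db in (-20, -10, 0, 10)]
    print(C, [round(v, 3) for v in row])
```

With no interferer the bound equals the single-user value $d_f R_c$, and it reaches zero once $\sqrt{E_2/E_1} \ge 1/C$, which corresponds to the performance floor described above.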

5.2.2 Soft-Decision Partitioned Receiver With A Linear Multiuser Receiver

Another interesting class of partitioned receivers is those with a linear inner multiuser
receiver, [44]. The most well known members of the class of linear multiuser
receivers are the decorrelator, [4], and the minimum mean squared error (MMSE)
receivers, [17], [21], [24] and [13]. In addition, there are a number of other linear
receivers that have appeared in the literature, including the optimal near-far resistant
linear receiver in [4].

In this section, we will focus on the decorrelator as a representative of this class
because it leads to a tractable analysis. This multiuser receiver has the property that
$RMUI_{kj} = 0$ at the decorrelator output, but the variance of $Z_{kj}$ will generally be larger
than that of the matched filter output noise due to this receiver's noise enhancement
property. For this receiver, we may write the two-sequence error probability as

$$P_2(\mathbf{e}_k) = P\left[\sum_{j=i_0}^{i_0+Q\Gamma} e_{kj}\left(\sqrt{E_k}\, D_{kj} + Z_{kj}\right) > 0\right] \tag{5.39}$$

$$= P\left[\sum_{j=i_0}^{i_0+Q\Gamma} e_{kj} Z_{kj} > -\sqrt{E_k}\sum_{j=i_0}^{i_0+Q\Gamma} e_{kj} D_{kj}\right] \tag{5.40}$$

If we next notice that $D_{kj} = -e_{kj}$ whenever $e_{kj} \ne 0$, use (5.31), and redefine $\beta$ for
this section in the same fashion as in (5.26) to now be

$$\beta = \sum_{j=i_0}^{i_0+Q\Gamma} e_{kj} Z_{kj} \tag{5.41}$$

then we rewrite (5.40) as

$$P_2(\mathbf{e}_k) = P\left[\beta > \sqrt{E_k}\; wt[\mathbf{e}_k]\right] \tag{5.42}$$

Figure 5.7 Plot of AMCG for the 2-user, $\rho_{21}(0) = \rho_{21}(1) = 0.2$ case where both users employ a rate 1/2
4-state code with $d_f = 5$. The ACG for a single-user system using this code is 10 log (1.5) dB for a
hard-decision decoder and 10 log (2.5) dB for a soft-decision decoder. Lower bounds are shown as solid lines,
upper bounds as dashed lines, and the partitioned hard-decision approaches are shown as dotted lines for
comparison (from Figure 5.3).

Figure 5.8 Plot of AMCG for the 2-user, $\rho_{21}(0) = \rho_{21}(1) = 0.3$ case where both users employ a rate 1/2
4-state code with $d_f = 5$. The ACG for a single-user system using this code is 10 log (1.5) dB for a
hard-decision decoder and 10 log (2.5) dB for a soft-decision decoder. Lower bounds are shown as solid lines,
upper bounds as dashed lines, and the partitioned hard-decision approaches are shown as dotted lines for
comparison (from Figure 5.4).

Figure 5.9 Plot of AMCG for the 2-user, $\rho_{21}(0) = \rho_{21}(1) = 0.4$ case where both users employ a rate 1/2
4-state code with $d_f = 5$. The ACG for a single-user system using this code is 10 log (1.5) dB for a
hard-decision decoder and 10 log (2.5) dB for a soft-decision decoder. Lower bounds are shown as solid lines,
upper bounds as dashed lines, and the partitioned hard-decision approaches are shown as dotted lines for
comparison (from Figure 5.5).

$\beta$ is a linear combination of the $Z_{kj}$'s, which are each, in turn, a linear combination of the
matched filter output noises. It is easy to show that $E[\beta] = 0$. The computation of
the second moment of $\beta$ requires more work, however.


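$$E[\beta^2] = E\left[\sum_{j=i_0}^{i_0+Q\Gamma} e_{kj} Z_{kj} \sum_{p=i_0}^{i_0+Q\Gamma} e_{kp} Z_{kp}\right] \tag{5.43}$$

$$= \sum_{j=i_0}^{i_0+Q\Gamma}\sum_{p=i_0}^{i_0+Q\Gamma} e_{kj} e_{kp}\, E[Z_{kj} Z_{kp}] \tag{5.44}$$

Next, define

$$E[Z_{kj} Z_{kp}] = \frac{N_0}{2}\,\Phi_{kk}(p-j) \tag{5.45}$$

Using this nomenclature, we may proceed to rewrite (5.42) as

$$P_2(\mathbf{e}_k) = Q\left(\frac{\sqrt{E_k}\; wt[\mathbf{e}_k]}{\sqrt{E[\beta^2]}}\right) \tag{5.46}$$

$$= Q\left(\sqrt{\frac{2E_k}{N_0}\cdot\frac{wt[\mathbf{e}_k]^2}{\sum_j\sum_p e_{kj} e_{kp}\,\Phi_{kk}(p-j)}}\right) \tag{5.47}$$

so it is easy to see from (5.47) that

$$\eta_k(\mathbf{e}_k) = \frac{R_c\, wt[\mathbf{e}_k]^2}{\sum_j\sum_p e_{kj} e_{kp}\,\Phi_{kk}(p-j)} \tag{5.48}$$

$$= \frac{R_c\, wt[\mathbf{e}_k]^2}{wt[\mathbf{e}_k]\,\Phi_{kk}(0) + 2\sum_{j=i_0}^{i_0+Q\Gamma}\sum_{l=1}^{Q\Gamma} e_{kj} e_{k,j-l}\,\Phi_{kk}(l)} \tag{5.49}$$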


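Now, all that remains to be done to obtain numerical results is to evaluate $\Phi_{kk}(l)$. We
will do this for the 2-user case. If $\rho(z)$ denotes the multiuser system channel transfer
function matrix (see Section 2.1.2), and so $\rho^{-1}(z)$ denotes the decorrelator's transfer
function matrix, then for the 2-user case we have, [4]

$$\rho^{-1}(z) = \frac{1}{1 - \rho_{12}^2 - \rho_{21}^2 - \rho_{12}\rho_{21}z - \rho_{12}\rho_{21}z^{-1}}\begin{bmatrix} 1 & -(\rho_{12} + \rho_{21}z) \\ -(\rho_{12} + \rho_{21}z^{-1}) & 1 \end{bmatrix} \tag{5.50}$$

so

$$\rho^{-1}_{kk}(z) = \frac{1}{1 - \rho_{12}^2 - \rho_{21}^2 - \rho_{12}\rho_{21}z - \rho_{12}\rho_{21}z^{-1}} \quad \text{for } k \in \{1,2\} \tag{5.51}$$

Taking the inverse Z-transform of this function, we obtain

$$\Phi_{kk}(l) = Z^{-1}\left[\rho^{-1}_{kk}(z)\right] = \frac{\left(1 - \rho_{12}^2 - \rho_{21}^2 - \sqrt{[1-(\rho_{12}+\rho_{21})^2][1-(\rho_{12}-\rho_{21})^2]}\right)^{|l|}}{(2\rho_{12}\rho_{21})^{|l|}\sqrt{[1-(\rho_{12}+\rho_{21})^2][1-(\rho_{12}-\rho_{21})^2]}} \tag{5.52}$$

where $\rho_{12} = \rho_{21}(0)$, and $\rho_{21} = \rho_{21}(1)$ [4].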

We may write the denominator of (5.48) as a matrix quadratic form

$$\eta_k(\mathbf{e}_k) = \frac{R_c\, wt[\mathbf{e}_k]^2}{\mathbf{e}_k^T\,\Phi_{kk}\,\mathbf{e}_k} \tag{5.53}$$

where $\Phi_{kk}$ is a positive-definite Toeplitz matrix with elements

$$[\Phi_{kk}]_{ij} = \Phi_{kk}(|i-j|) \tag{5.54}$$

Next, if we note that the Rayleigh quotient may be upper bounded by the largest eigenvalue
of $\Phi_{kk}$, namely $\lambda_{\max}$,

$$\frac{\mathbf{e}_k^T\,\Phi_{kk}\,\mathbf{e}_k}{\mathbf{e}_k^T\mathbf{e}_k} \le \lambda_{\max} \tag{5.55}$$

and we note that $\mathbf{e}_k^T\mathbf{e}_k = wt[\mathbf{e}_k]$, then we obtain the following lower bound

$$\eta_k(\mathbf{e}_k) \ge \frac{R_c\, wt[\mathbf{e}_k]}{\lambda_{\max}} \tag{5.56}$$

A lower bound on the AMCG would then be the following

$$\eta_{k,\min} \ge \min_{\mathbf{e}_k}\frac{R_c\, wt[\mathbf{e}_k]}{\lambda_{\max}} = \frac{R_c\, d_f}{\lambda_{\max}} \tag{5.57}$$

Because $\lambda_{\max}$ depends on the dimensions of $\mathbf{e}_k$ and $\Phi_{kk}$, we must note that this
eigenvalue increases as the dimension of $\mathbf{e}_k$ and $\Phi_{kk}$ increases. It thus follows that the
tightest bound will be obtained by using the eigenvalue corresponding to the smallest
permissible dimension, which is 6 for the rate 1/2 4-state code we are considering as our
example. This leads to $\lambda_{\max} = 1.5228$ for the case where $\rho_{21}(1) = \rho_{21}(0) = 0.3$. For this
case, equation (5.57) yields the bound $\eta_{k,\min} \ge 1.6417$.
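The eigenvalue bound can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions: it uses the closed form for the two-user decorrelator noise correlation, $\Phi_{kk}(l) = r^{|l|}/s$ with $s = \sqrt{[1-(\rho_{12}+\rho_{21})^2][1-(\rho_{12}-\rho_{21})^2]}$ and $r = (1-\rho_{12}^2-\rho_{21}^2-s)/(2\rho_{12}\rho_{21})$, and it finds the largest eigenvalue by plain power iteration rather than by whatever numerical method was used originally. All function names are hypothetical.

```python
import math


def phi(l, rho12, rho21):
    """Two-user decorrelator noise correlation at lag l (needs rho12*rho21 != 0)."""
    s = math.sqrt((1 - (rho12 + rho21) ** 2) * (1 - (rho12 - rho21) ** 2))
    r = (1 - rho12 ** 2 - rho21 ** 2 - s) / (2 * rho12 * rho21)
    return r ** abs(l) / s


def largest_eigenvalue(matrix, iterations=2000):
    """Power iteration; adequate for a small symmetric positive-definite matrix."""
    n = len(matrix)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = math.sqrt(sum(x * x for x in w))  # norm of A*v
        v = [x / lam for x in w]                # renormalize to unit length
    return lam


n, R_c, d_f = 6, 0.5, 5  # smallest permissible dimension for the example code
Phi = [[phi(i - j, 0.3, 0.3) for j in range(n)] for i in range(n)]
lam_max = largest_eigenvalue(Phi)
eig_bound = R_c * d_f / lam_max
print(f"lambda_max = {lam_max:.4f}, bound = {eig_bound:.4f}")
```

For $\rho_{21}(0) = \rho_{21}(1) = 0.3$ the largest eigenvalue comes out at about 1.5228, and $R_c d_f/\lambda_{\max}$ is about 1.6417, in agreement with the bound quoted above.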

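We may obtain a slightly tighter lower bound on the AMCG via the following procedure.
Note that $\Phi_{kk}(l)$ has the property that $\Phi_{kk}(0) > \Phi_{kk}(1) > \Phi_{kk}(2) > \ldots$, so the
second term in the denominator of (5.49) may be upper bounded as follows

$$\sum_{j=i_0}^{i_0+Q\Gamma}\sum_{l=1}^{Q\Gamma} e_{kj} e_{k,j-l}\,\Phi_{kk}(l) \le \sum_{l=1}^{wt[\mathbf{e}_k]-1}\Phi_{kk}(l)\left(wt[\mathbf{e}_k] - l\right) \tag{5.58}$$

and this allows us to bound the expression in (5.49) by

$$\eta_k(\mathbf{e}_k) \ge \frac{R_c\, wt[\mathbf{e}_k]^2}{wt[\mathbf{e}_k]\,\Phi_{kk}(0) + 2\sum_{l=1}^{wt[\mathbf{e}_k]-1}\Phi_{kk}(l)\left(wt[\mathbf{e}_k] - l\right)} = g(R_c, wt[\mathbf{e}_k], \rho_{12}, \rho_{21}) \tag{5.59}$$

This is a bound on $\eta_k(\mathbf{e}_k)$ which is only a function of $wt[\mathbf{e}_k]$ rather than the actual error
sequence $\mathbf{e}_k$. Using the fact that this expression is monotonically increasing with $wt[\mathbf{e}_k]$,
we can obtain a lower bound on the AMCG for this partitioned receiver.

$$\eta_{k,\min} \ge \min_{\mathbf{e}_k} g(R_c, wt[\mathbf{e}_k], \rho_{12}, \rho_{21}) = \frac{R_c\, d_f^2}{d_f\,\Phi_{kk}(0) + 2\sum_{l=1}^{d_f-1}\Phi_{kk}(l)\,(d_f - l)} \tag{5.60}$$

As an example, for the case where $\rho_{21}(1) = \rho_{21}(0) = 0.3$ we get $\eta_{k,\min} \ge 1.67$. This result is
tighter than the bound which used the eigenvalue bound on the Rayleigh quotient and it is
plotted in Figure 5.8. The results for $\rho_{21}(0) = \rho_{21}(1) = 0.2$ and 0.4 are plotted in Figures
5.7 and 5.9 respectively.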
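The weight-only bound is simple to evaluate. The sketch below assumes the bound has the form $g = R_c w^2 / [w\,\Phi_{kk}(0) + 2\sum_{l=1}^{w-1}\Phi_{kk}(l)(w-l)]$ with $w = wt[\mathbf{e}_k]$, together with the two-user closed form $\Phi_{kk}(l) = r^{|l|}/s$, where $s = \sqrt{[1-(\rho_{12}+\rho_{21})^2][1-(\rho_{12}-\rho_{21})^2]}$ and $r = (1-\rho_{12}^2-\rho_{21}^2-s)/(2\rho_{12}\rho_{21})$. The function names are hypothetical.

```python
import math


def phi(l, rho12, rho21):
    """Two-user decorrelator noise correlation at lag l (needs rho12*rho21 != 0)."""
    s = math.sqrt((1 - (rho12 + rho21) ** 2) * (1 - (rho12 - rho21) ** 2))
    r = (1 - rho12 ** 2 - rho21 ** 2 - s) / (2 * rho12 * rho21)
    return r ** abs(l) / s


def g(R_c, weight, rho12, rho21):
    """Lower bound on the AMCG that depends only on the error weight."""
    denom = weight * phi(0, rho12, rho21) + 2 * sum(
        phi(l, rho12, rho21) * (weight - l) for l in range(1, weight))
    return R_c * weight ** 2 / denom


R_c, d_f = 0.5, 5
bounds = {rho: g(R_c, d_f, rho, rho) for rho in (0.2, 0.3, 0.4)}
for rho, value in bounds.items():
    print(f"rho21(0) = rho21(1) = {rho}: AMCG >= {value:.3f}")
```

For the 0.3 case this evaluates to roughly 1.675, slightly above the eigenvalue-based bound, and the value grows when the weight is increased beyond $d_f$, which is why the minimum is attained at $wt[\mathbf{e}_k] = d_f$.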

Another interesting result of (5.60) is that it provides a tightening of the lower
bound on the AMCG of the MLSE of Chapter 4. This is due to the fact that the MLSE,
which is the optimal sequence estimator, will have a higher AMCG than the partitioned
soft-decision decorrelator, which is a suboptimal sequence estimator. This tightening of
the bound on the MLSE's AMCG is incorporated into Figures 5.7 - 5.9.

An upper bound on the AMCG for the soft-decision partitioned decorrelator may be
obtained by performing a non-exhaustive search over the set of possible valid error
sequences. One valid error vector for the standard four-state rate 1/2 code is the following,
$\mathbf{e}_k = (110111)$. This error vector gives the smallest result of equation (5.53) of those
tested. Because the actual minimum of equation (5.53) over the set of all valid error
sequences must be no larger than the minimum of (5.53) over the small set of sequences
that we tested, we have an upper bound on the AMCG. This upper bound turns out to be
quite close to the lower bound we have already obtained, so we have a very accurate picture
of the partitioned decorrelator's performance. This upper bound is also plotted in
Figures 5.7 - 5.9.
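The upper bound can be illustrated by evaluating the quadratic form directly for the quoted error vector. The sketch assumes $\eta_k(\mathbf{e}_k) = R_c\, wt[\mathbf{e}_k]^2 / (\mathbf{e}_k^T \Phi_{kk} \mathbf{e}_k)$ with $[\Phi_{kk}]_{ij} = \Phi_{kk}(|i-j|)$, takes every nonzero entry of the error vector as $+1$, and uses the two-user closed form $\Phi_{kk}(l) = r^{|l|}/s$ with $s = \sqrt{[1-(\rho_{12}+\rho_{21})^2][1-(\rho_{12}-\rho_{21})^2]}$ and $r = (1-\rho_{12}^2-\rho_{21}^2-s)/(2\rho_{12}\rho_{21})$. These choices and the function names are assumptions for illustration.

```python
import math


def phi(l, rho12, rho21):
    """Two-user decorrelator noise correlation at lag l (needs rho12*rho21 != 0)."""
    s = math.sqrt((1 - (rho12 + rho21) ** 2) * (1 - (rho12 - rho21) ** 2))
    r = (1 - rho12 ** 2 - rho21 ** 2 - s) / (2 * rho12 * rho21)
    return r ** abs(l) / s


def amcg_for_error_vector(e, R_c, rho12, rho21):
    """AMCG of one specific error vector via the quadratic form e^T Phi e."""
    weight = sum(x * x for x in e)
    quad = sum(e[i] * e[j] * phi(i - j, rho12, rho21)
               for i in range(len(e)) for j in range(len(e)))
    return R_c * weight ** 2 / quad


e = (1, 1, 0, 1, 1, 1)  # error vector quoted in the text
upper = amcg_for_error_vector(e, 0.5, 0.3, 0.3)
print(f"AMCG upper bound from e = {e}: {upper:.4f}")
```

For the 0.3 case this gives about 1.748, which sits a little above the lower bound of about 1.67, in line with the remark that the two bounds are close.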

The complexity of the soft-decision and hard-decision partitioned receivers with a
decorrelating inner receiver will be the same, as will the decoding delay. This is a result
of the fact that the only difference between the two is that the soft-decision version does
not make a hard decision on the decision statistics before passing them to the outer
Viterbi decoders, and the outer decoder metrics will differ but be of the same complexity
order.

5.2.3 Soft-Decision Partitioned Receiver with a Trellis-Based or Tree-Based Inner
Receiver

A third approach which potentially will have the highest performance of any partitioned
approach is the one which uses a soft-decision trellis-based or tree-based receiver
as the inner multiuser receiver. There are a number of possible trellis-based receivers,
the most important of which are the ML sequence estimator, [1], and the reduced state
sequence estimator (RSSE), [8], [56]. As an example of a tree-based receiver, see [14].
In their standard form, each of these receivers outputs hard decisions, and so the techniques
of [51] and [52] must be applied to allow the inner receiver to supply soft outputs
to the outer Viterbi decoders.

This  approach  has  recently  been  proposed  by  two  research  groups,  [53],  and  [16]. 
Both  groups  cite  the  prohibitive  complexity  of  the  full  ML  sequence  estimator,  and 
examine  RSSE  and  sequential  decoding  alternatives. 

The computation of the AMCG for these approaches remains an open problem. The
soft-decision partitioned receiver with an ML sequence estimator should
have a higher AMCG than any other partitioned receiver, since the ML sequence estimator
is the optimum inner receiver in the sequence error probability sense. It follows that
RSSE and sequential decoding approaches which do not suffer significantly in performance
relative to the MLSE will also have a high AMCG. The interested reader is
referred to [16] and [53] for more detail on these particular approaches.

5.2.4  Soft-Decision  Partitioned  Receiver  With  a  DFE  Inner  Receiver 

In this approach, a multistage DFE operates on the set of matched filter outputs by
making tentative decisions and feeding these decisions back to make estimates of the
MUI which will be subtracted from other matched filter outputs. The key in the soft code
symbol DFE (SCS-DFE) approach is that at the final stage, the MUI estimate will again
be subtracted from the delayed matched filter output, but no hard-decision making will be
performed. Instead, the modified matched filter output will be passed straight to the
Viterbi decoder.
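The final-stage operation described above can be sketched as follows. This is a deliberately simplified illustration, not the asynchronous multistage structure itself: it assumes a symbol-synchronous link observed over a single interval, a known cross-correlation matrix and known energies, and tentative decisions supplied by some earlier stage. All names are hypothetical.

```python
import math


def final_stage_soft_outputs(mf_outputs, tentative_bits, corr, energies):
    """Subtract the MUI estimate from each matched filter output.

    No hard decision is taken on the result; the soft values would be
    passed on to the outer Viterbi decoders.
    """
    K = len(mf_outputs)
    soft = []
    for k in range(K):
        mui_estimate = sum(
            corr[k][m] * math.sqrt(energies[m]) * tentative_bits[m]
            for m in range(K) if m != k)
        soft.append(mf_outputs[k] - mui_estimate)
    return soft


# Noise-free two-user example with correct tentative decisions.
energies = [1.0, 4.0]
bits = [1, -1]
corr = [[1.0, 0.3], [0.3, 1.0]]
mf = [sum(corr[k][m] * math.sqrt(energies[m]) * bits[m] for m in range(2))
      for k in range(2)]
soft = final_stage_soft_outputs(mf, bits, corr, energies)
print(soft)
```

In this idealized setting the soft outputs reduce to $\sqrt{E_k}$ times the transmitted symbol, that is, the interference is removed without any noise enhancement. With incorrect tentative decisions the subtraction would instead add interference.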

At this point, we have not committed to a particular type of multistage DFE. As
discussed in Chapter 3, there have been at least six architectures proposed in the literature
for asynchronous CDMA links, [6], [13], [22]-[24], [34], [20] and [45], and there has
also been some work on improving the decision making procedure of the algorithms,
[28]. The improved decision making procedures in [28] can easily be applied to any of
the basic architectures.

The performance of this class of approaches is difficult to evaluate analytically, due
to the presence of error propagation. Expressions for the AME of the Varanasi DFE have
recently been reported in [29], but the approaches used to get those AME expressions do
not easily generalize to the coded link case. It is possible to evaluate the AMCG under
the assumption of correct feedback. However, this assumption implies that as long as the
correlation parameters and energies have been perfectly estimated, RMUI = 0. Since there
is no residual interference if the feedback is correct, and there is no noise enhancement
(unlike the case of the decorrelator), we obtain the result that the AMCG is that of a
single-user system. Clearly, the presence of error propagation degrades the AMCG by some
amount, so the computation of the actual AMCG remains an open question. Consequently,
simulation will be the performance evaluation technique for this class of receiver.

Because the structure of the integrated DFE which will be discussed in Section 5.3
is most like the Varanasi style uncoded link multistage decoder, it is interesting to compare
the integrated DFE with the Varanasi style SCS-DFE. In addition, because it was
shown in Chapter 3 that in most cases, the Hybrid DFE outperforms the other two architectures
on an uncoded link, it is an obvious candidate for use in an SCS-DFE structure.
Thus, the structures that were simulated were the Hybrid and Varanasi versions of the
SCS-DFE. The modifications to the decision making devices in each preliminary stage
of the multistage decoders discussed in [28] were not considered here, although those
modifications may provide improvements in some cases.

Figure 5.10 shows the performance curves for the "0.2 channel" illustrated by Figure
3.4a. As Figure 5.10 illustrates, the conventional decoder suffers about a 3 dB loss at
average  = 2-10“^  relative to the performance of the same receiver operating in the
absence of MUI. For this case, the Hybrid version of the SCS-DFE outperforms the
Varanasi version of the SCS-DFE. This is similar to the results obtained on the uncoded
link simulated in Chapter 3.

Figure 5.11 shows the performance for the more severe channel illustrated in Figure
3.4b. In this figure, it is evident that all of the decoders perform significantly worse than
in the "0.2 channel" due to the more severe MUI. In addition, the Hybrid SCS-DFE again
outperforms the Varanasi DFE.

Once again, the complexity and decoding delay of the soft-decision partitioned DFE
receiver will be of the same order as in the hard-decision case.

5.3 Combined Equalization and Decoding Approaches

In all of the partitioned approaches, regardless of the type, the multiuser receiver
operates at the code symbol level as though there were no coding on the link, and then
passes its decisions or improved statistics to an outer decoder. The deficiency with this
approach is that separating the functions of cancelling the MUI and decoding the message
does not take full advantage of the coding on the link. The approaches discussed in
this section attempt to alleviate this shortcoming.

5.3.1 Linear Combined Equalization and Decoding Approaches

A linear approach could be defined as any approach which somehow forms decision
statistics for the information symbols using a linear method, i.e. by linearly combining
matched filter outputs, or more generally by performing linear operations on the received
waveform without the use of matched filters at all. For any linear receiver, after decision
statistics have been formed, a decision must be made to determine the estimated bit. This
decision making procedure will typically be nonlinear. One example of a decision making
procedure would be the comparison of a decision statistic to a threshold and output of a
corresponding bit (i.e. the signum function). Another example would be a decision maker
which chooses the largest of a set of decision statistics and outputs the corresponding
symbol or symbols.
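The two decision makers mentioned above can be written in a few lines. This is a minimal sketch with hypothetical function names; the tie-breaking rules (a zero statistic maps to +1, and the first maximum wins) are arbitrary assumptions.

```python
def signum_decision(statistic):
    """Threshold a decision statistic at zero and output an antipodal bit."""
    return 1 if statistic >= 0 else -1


def largest_statistic_decision(statistics, symbols):
    """Output the symbol whose decision statistic is the largest."""
    best = max(range(len(statistics)), key=lambda i: statistics[i])
    return symbols[best]


bit = signum_decision(-0.7)
symbol = largest_statistic_decision([0.2, 1.9, -0.4], ["s0", "s1", "s2"])
```

Both operations are nonlinear in the decision statistics even though the statistics themselves are formed linearly.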
Figure 5.10 Performance curves of the soft-decision partitioned decision feedback receivers for
the 4-user 0.2 channel illustrated in Figure 3.4a. The solid lines show a single user system (no
MUI) with and without the rate-1/2 4-state convolutional code. Also shown are the one, two and
three stage soft code symbol DFEs for both the Varanasi (dashed) and Hybrid (dotted) architectures.
Note that the Varanasi style one-stage soft code symbol DFE is equivalent to the conventional
receiver.

Figure 5.11 Performance curves for the various soft-decision partitioned decision feedback
receivers for the more severe 4-user 0.25 channel illustrated in Figure 3.4b. The solid lines show
a single user system (no MUI) with and without the rate-1/2 4-state convolutional code. Also
shown are the one, two and three stage soft code symbol DFEs of both the Varanasi (dashed) and
Hybrid (dotted) architectures.

The optimal receiver in terms of sequence error probability may be implemented in
its most general form as a bank of correlators, each of which correlates the received
sequence with a waveform corresponding to a different transmitted information bit
sequence. The decision maker would then simply choose the sequence of information
bits for each user which corresponded to the largest correlator output. If the horizon of
the transmission is $2M+1$ as in Chapter 4, then there would need to be $2^{2M+1}$ correlators
assuming that each user sends binary data. For even a fairly small horizon, the number
of correlators would be prohibitively high. We saw in Chapter 4 that this same ML
sequence estimator may be implemented using a trellis-based receiver whose complexity
did not depend on the horizon size.

The fact that the ML receiver can be implemented in a linear form, however, is
significant because it raises the possibility of forming suboptimum receivers which are
linear and have a lower complexity than the optimal linear receiver. The question
remains open, however, as to whether the complexity could ever be lowered to an implementable
level. Clearly, it would be desirable to develop a linear receiver which linearly
combined the matched filter outputs to form its decision statistics, rather than having to
build a large number of correlators. The convolutional code used by each of the users is
a linear code, and thus may in some cases be invertible using a stable (and maybe even
causal) linear receiver. The determination of the exact structure of this combined equalization
and decoding linear receiver with a reasonable complexity remains an open question
at this point, however.

5.3.2 Trellis-Based and Tree-Based Combined Equalization and Decoding
Approaches

The MLSE of Chapter 4 is a trellis-based approach which combines the functions of
equalization and decoding into one operation. This approach is the optimal sequence
estimator. In this section, we briefly discuss the notion of a suboptimum trellis-based or
tree-based receiver, which was alluded to in Chapter 4.

The MLSE has a prohibitively high number of states; thus it is natural to look to
simplify the decoding process by combining states in some fashion. This approach
is what is referred to as reduced state sequence estimation (RSSE).
There are a number of publications on the application of RSSE to simplify
the MLSE for the uncoded case, [8], [56], as well as at least one other on the application
of RSSE to simplify the process of equalizing and decoding a coded signal on a
single-user dispersive link, [47]. It is undoubtedly possible to apply these techniques to the
problem at hand to obtain a performance versus complexity tradeoff.

Just as in [14], it is also presumably possible to use sequential decoding approaches
to lower the complexity of the MLSE for the coded link case. Sequential decoding
would also provide the opportunity to trade off performance against complexity.

One problem with the trellis-based and tree-based approaches, however, is that they
are apparently not as robust to mismatch (a misestimation of the correlation or energy
parameters) as some of the simpler approaches like the partitioned decorrelator and DFE
approaches. In [27], it was shown that the uncoded link MLSE was not as robust as a
Varanasi DFE to mismatch, in the sense that for even small values of mismatch, the
suboptimum DFE outperformed the MLSE. This high sensitivity of the optimal approach
to mismatch is a very undesirable feature, and it may very well carry over to suboptimal
approaches which are based on the MLSE like the RSSE and sequential approaches.
Nonetheless, given no mismatch, the trellis-based and tree-based approaches have the
potential to perform nearly as well as the MLSE, possibly with a significantly decreased
complexity.

5.3.3 The Decision Feedback Combined Equalization and Decoding Approach: The
Integrated DFE

The idea in the approach we will refer to as the integrated DFE is that the MUI can
be more reliably estimated by exploiting the coding. Thus, instead of using a
hard-decision device in the first stage of a multistage DFE, like that of [6], we may decode the
message using a soft-decision Viterbi algorithm which operates on the stream of matched
filter outputs in the channel, and then re-encodes the decoded bits to form estimates of
the code bits. These code bit estimates can then be used to estimate the MUI in other
users' channels. Again, as in the uncoded case, the decision feedback may be performed
for as many stages as is desired (see Figure 5.12). The performance of this approach is
difficult to evaluate analytically, again due to the presence of incorrect feedback. As a
result, simulation will be the performance evaluation technique in this section.
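
The decode, re-encode and cancel loop described above can be sketched as follows. This is a minimal illustration rather than the receiver simulated in this chapter: it assumes a symbol-synchronous channel model (the system studied here is asynchronous), a rate-1/2 4-state code with generators (7, 5) in octal, an unterminated trellis, parallel feedback from all users, and known correlations R and amplitudes amps. All function and variable names are hypothetical.

```python
import numpy as np

G = ((1, 1, 1), (1, 0, 1))  # assumed generators (7, 5 octal), rate 1/2, 4 states


def encode(bits):
    """Rate-1/2 convolutional encoder; returns code bits in {0, 1}."""
    state = [0, 0]  # [most recent bit, older bit]
    out = []
    for b in bits:
        window = [int(b)] + state
        for g in G:
            out.append(sum(x & y for x, y in zip(window, g)) % 2)
        state = [int(b), state[0]]
    return np.array(out)


def viterbi(soft):
    """Soft-decision Viterbi decoder using a correlation metric on +/-1 symbols."""
    n_states = 4
    metric = np.full(n_states, -np.inf)
    metric[0] = 0.0  # encoder starts in the all-zero state
    paths = [[] for _ in range(n_states)]
    for t in range(len(soft) // 2):
        r = soft[2 * t:2 * t + 2]
        new_metric = np.full(n_states, -np.inf)
        new_paths = [None] * n_states
        for s in range(n_states):
            if metric[s] == -np.inf:
                continue  # state not yet reachable
            recent, older = s >> 1, s & 1
            for b in (0, 1):
                code = [sum(x & y for x, y in zip((b, recent, older), g)) % 2 for g in G]
                m = metric[s] + float(r @ (2 * np.array(code) - 1))
                nxt = (b << 1) | recent
                if m > new_metric[nxt]:
                    new_metric[nxt] = m
                    new_paths[nxt] = paths[s] + [b]
        metric, paths = new_metric, new_paths
    return np.array(paths[int(np.argmax(metric))])


def integrated_dfe(y, R, amps, stages):
    """y: K x N matched filter outputs. Returns K x N/2 decoded information bits."""
    K = y.shape[0]
    # Stage 1: conventional receiver (one Viterbi decoder per user, no cancellation).
    decoded = np.array([viterbi(y[k]) for k in range(K)])
    off_diag = R - np.diag(np.diag(R))  # cross-correlations only
    for _ in range(stages - 1):
        # Re-encode the previous stage's decisions into +/-1 code bit estimates.
        c_hat = np.array([2 * encode(d) - 1 for d in decoded])
        mui = off_diag @ (amps[:, None] * c_hat)  # estimated MUI seen by each user
        decoded = np.array([viterbi(y[k] - mui[k]) for k in range(K)])
    return decoded
```

In a noiseless two-user case with a strong interferer, the first stage may decode the weak user incorrectly, while the second stage recovers it once the strong user's re-encoded code bits have been subtracted.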

A characteristic of convolutional codes, or most codes for that matter, is that at very
low signal to noise ratios, the coded link may perform worse than an uncoded link.
When the signal to noise ratio is in this regime, it is possible that the Viterbi decoder,
whose outputs are re-encoded to form the ML estimate of the code bit sequence, may
perform worse than a simple hard-decision device operating on the code bits without
regard to the coding. As a result, for the integrated DFE to outperform an SCS-DFE,
the combination of the thermal noise and MUI must not be so strong that the re-encoded
Viterbi output sequence is worse than the code bit estimates of a simple threshold
detector. Basically, the structure which provides
better estimates of the code bit sequence will have a better estimate of the MUI in the
other channels' multistage decoders. In general, because coding generally allows better
estimates of the transmitted sequence, it is reasonable to expect the integrated DFE to
outperform an SCS-DFE of a similar architecture, such as an SCS-DFE with a Varanasi DFE.

Figure 5.13 shows performance of the integrated DFE and the various SCS-DFE
approaches on the "0.2 channel" illustrated by Figure 3.4a. In this environment, the
integrated DFE is able to nearly recoup all of this loss, while the various SCS-DFEs are
able to only recoup some of the loss. It may thus be concluded that the thermal noise and
MUI are weak enough to be operating in the regime where a receiver which exploits the
coding performs better than one which does not.

Figure 5.12 The structure of a 3-stage integrated DFE for the user with Viterbi algorithms (denoted VA)
at each stage. A is the maximum delay in code bit periods corresponding to 8 information bit periods

Figure 5.14 shows the performance on the more severe channel illustrated in Figure
3.4b. The integrated DFE still uniformly outperforms the Varanasi version of the
SCS-DFE. For these particular channel characteristics, however, the hybrid SCS-DFE is able
to outperform the integrated DFE at larger values of $E_b/N_0$. While this may seem
surprising at first, it is simply due to the fact that even though the hybrid SCS-DFE
performs separate equalization and decoding, it has a high quality first stage which is able to
provide better code symbol estimates to the second stage MUI estimator than the
conventional Viterbi algorithm operating in the first stage of the integrated DFE. This case
illustrates that when the MUI is strong enough, the integrated DFE will not always
outperform a well designed SCS-DFE, although it does in most cases.

To compute the TCB for the integrated DFE, we again assume that the computation
of the MUI in each stage of the DFE structures is roughly equivalent in complexity to the
computation of one metric in the Viterbi decoder. Adopting this convention, we
may conclude that for a link with rate $1/Q$ and constraint length $W$ codes, the $J$-stage
integrated DFE has a time complexity of roughly $\mathrm{TCB} = O((J-1)Q + J2^{W})$. This is
significantly higher than that of the SCS-DFE, although it is far less than that of the
MLSE of Chapter 4. In the general rate $P/Q$ code case, if again $k = \log_2 S$, where $S$ is the
number of states in each user's encoder, the integrated DFE has
$\mathrm{TCB} = O([(J-1)Q + J2^{k+P}]/P)$. Again, because each MUI computation grows in complexity
with $K$, we may say that the number of operations per decoded bit is on the order of
$\mathrm{OP}_{idfe} = O([(J-1)QK + J2^{k+P}]/P)$.
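
As a quick numerical check, the sketch below evaluates the operation count per decoded bit, reading the expression above as [(J-1)QK + J 2^(k+P)]/P. That reading is an assumption about the formula, and the function and argument names are illustrative. For the two rate-1/2 cases it reproduces the integrated DFE entries of Figure 5.15.

```python
def ops_per_decoded_bit(stages, users, states, P=1, Q=2):
    """Approximate operations per decoded bit for a J-stage integrated DFE.

    Assumed reading: [(J - 1) * Q * K + J * 2**(k + P)] / P, with k = log2(states).
    """
    k = states.bit_length() - 1             # binary memory order; states must be a power of 2
    mui_terms = (stages - 1) * Q * users    # MUI estimates feeding stages 2..J
    viterbi_terms = stages * 2 ** (k + P)   # branch metrics, one Viterbi pass per stage
    return (mui_terms + viterbi_terms) / P


# The two cases tabulated in Figure 5.15 (rate-1/2 codes).
for J in (2, 3):
    print(J, ops_per_decoded_bit(J, users=2, states=2),
          ops_per_decoded_bit(J, users=100, states=64))
```

The MUI term grows linearly in the number of users while the Viterbi term does not depend on it, which is why the integrated DFE remains tractable in the 100-user case.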

If it is assumed again that the Viterbi decoders used operate with a decoding delay
which is typically on the order of $5W$, then the overall decoding delay of the $J$-stage
integrated DFE is roughly $5JW$.

Figure 5.13 Performance curves of the various decision feedback receivers for the 4-user 0.2
channel illustrated in Figure 3.4a. The solid lines show a single user system (no MUI) with and
without the rate-1/2 4-state convolutional code. Also shown are the one, two and three stage soft
code symbol DFEs for both the Varanasi (dashed) and Hybrid (dotted) architectures, and a one,
two and three stage integrated DFE (dashed lines). Note that the Varanasi style one-stage soft
code symbol DFE and the one-stage integrated DFE are both equivalent to the conventional
receiver.

Figure 5.14 Performance curves of the various decision feedback receivers for the more severe 4-user
0.25 channel illustrated in Figure 3.4b. The solid lines show a single user system (no MUI) with and
without the rate-1/2 4-state convolutional code. Also shown are the one, two and three stage soft code
symbol DFEs of both the Varanasi (dashed) and Hybrid (dotted) architectures, and a one, two and three
stage integrated DFE (dashed lines). Note that the Varanasi style one-stage soft code symbol DFE and
the one-stage integrated DFE are both equivalent to the conventional receiver.

We may thus conclude that the integrated DFE has a larger TCB and a longer
decoding delay than any of the SCS-DFE approaches, but its performance is better in
most cases. As a result, the appropriate choice of receiver configuration will depend on
the expected severity of the channel and the complexity and delay constraints on the
receiver. As we have seen in Figure 5.14, however, even the integrated DFE does not
perform well when the MUI becomes too strong.
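
The delay penalty can be made concrete with a short sketch. It assumes the rule of thumb used in this section (each Viterbi pass waits about five constraint lengths, giving roughly 5JW information bit intervals for J stages) and takes W = log2(S) + 1 for an S-state rate-1/Q code; the function name and the depth_factor parameter are illustrative. With these assumptions it reproduces the integrated DFE rows of Figure 5.16.

```python
def integrated_dfe_delay(stages, states, depth_factor=5):
    """Approximate decoding delay, in information bit intervals, of a J-stage integrated DFE.

    Assumes each of the J Viterbi passes contributes about depth_factor * W intervals.
    """
    W = states.bit_length()  # equals log2(states) + 1 when states is a power of 2
    return depth_factor * stages * W


for J in (1, 2, 3):
    print(J, integrated_dfe_delay(J, states=2), integrated_dfe_delay(J, states=64))
```

The one-stage case reduces to the delay of the conventional receiver, since a one-stage integrated DFE performs no feedback.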

5.4 Comparison of the Suboptimum Approaches

In this chapter, a large number of approaches have been discussed (see Figure 5.1).
The approaches can be categorized as linear, DFE and trellis-based, as in Figure 5.1, or
they may be categorized as partitioned and combined approaches as they were presented
in this chapter.

The determination of which approach is the best for a specific situation is not trivial,
because there are a number of factors which must be considered. Throughout this
chapter, the number of arithmetic operations per decoded bit was used as a measure of
the complexity of the receivers. Figure 5.15 shows a table comparing the complexity for
some of the approaches discussed for two specific cases, a 2-user 2-state code case and a
100-user 64-state code case. It is clear that for a large number of users and large codes,
the MLSE and Partitioned MLSE are too complex to be used. Most of the other
approaches are, however, fairly reasonable.

The decoding delay is another factor which will be of importance in some applications,
such as voice communications. Figure 5.16 shows a table comparing the decoding
delay of some of the approaches discussed for the same two cases as were used in Figure
5.15. The interesting feature of this table is that most of the approaches have about the
same decoding delay, with the exception of the Integrated DFE approaches. Even the
very complex MLSE has a decoding delay that is the same as the conventional receiver,
assuming that all of the decoding operations can be completed in real time (which they
typically cannot).

Approximate Number of Operations Per Decoded Bit

receiver                 2-user, 2-state case    100-user, 64-state case
MLSE                     32                      5E212
Partitioned MLSE         20                      1.27E32
3-stg. integrated DFE    20                      784
2-stg. integrated DFE    12                      456
3-stg. SCS-DFE           16                      728
2-stg. SCS-DFE           12                      528
Partitioned Decor.       24                      1128
Conventional             4                       128

Figure 5.15 Complexity comparison for two specific rate 1/2 code cases. The partitioned
decorrelator approach assumes an impulse response truncation depth of 5.

Decoding Delay

receiver                 2-user, 2-state case    100-user, 64-state case
MLSE                     10Ts                    35Ts
Partitioned MLSE         12.5Ts                  37.5Ts
3-stg. integrated DFE    30Ts                    105Ts
2-stg. integrated DFE    20Ts                    70Ts
3-stg. SCS-DFE           11Ts                    36Ts
2-stg. SCS-DFE           10.5Ts                  35.5Ts
Partitioned Decor.       11.25Ts                 36.25Ts
Conventional             10Ts                    35Ts

Figure 5.16 Decoding delay comparison for the same cases as in Figure 5.15.

In all cases, the performance is a very important factor in the determination of
which approach is the best. Throughout this chapter, the various approaches were
compared using the AMCG performance measure whenever possible. When the receivers did
not lend themselves to an AMCG analysis, as in the case of the DFEs, computer
simulations were used to compare their performance to the important baselines: the single-user
bound, the MLSE's performance and the conventional receiver's performance. In some
cases, no performance analysis was given at all, but the receivers were discussed briefly
anyway, for the sake of completeness. For these receivers, the interested reader was
either referred to references which discussed their performance, as in the case of the
soft-decision partitioned trellis-based/tree-based approaches, or a performance analysis
has simply not been done yet. For most of the receivers in this chapter, however, the tools
have been developed for comparing the various options.

An examination of the figures of this chapter illustrates that the MLSE of Chapter 4
has the highest performance. The worst performance of any receiver considered was the
hard-decision conventional, followed by the soft-decision conventional. The
soft-decision partitioned approaches with a decorrelator or DFE inner receiver provide
reasonable performance and have a fairly low complexity. If a higher complexity and
decoding delay is tolerable, the integrated DFE will usually provide the best compromise of
performance and complexity.

Thus the basestation architecture which is most appropriate will depend on a
number of different factors and there is no single correct solution for every situation.
This dissertation has presented the pros and cons of each approach, however, so that the
options may be compared in light of the constraints of a given application.

Chapter 6 Conclusions

In this final chapter, the important results of this dissertation will be summarized.
The merits and drawbacks of the proposed basestation architectures will be discussed.
Finally, this chapter will conclude with some thoughts on questions that remain open and
may warrant additional research in the future.

6.1  Summary  of  Results 

In this thesis, we considered direct sequence asynchronous CDMA systems with
and without coding on a nondispersive AWGN channel. The notion of multiuser
detection, wherein a receiver jointly demodulates all of the users in a CDMA system, was first
reviewed and then extended to CDMA systems with coding. In Chapter 2, after defining
the concept of multiuser detection, a survey of the literature on the topic was presented.
The review of the optimal receiver for the uncoded case led to the study of a wide variety
of suboptimum approaches for the uncoded case which have already been proposed.
These approaches may be broadly categorized as trellis/tree based approaches, linear
approaches and decision feedback approaches.

The review of the decision feedback approaches was deferred to Chapter 3, because
the discussion of the nonlinear DFE approaches in [6] and [13] led to a new hybrid DFE
approach. This hybrid DFE was shown through simulations to greatly outperform the
approaches in [6] and [13] in most cases. It was next seen that the approaches discussed
in [7], [22] - [24] and [28] could be incorporated into the hybrid design to provide a DFE
which used the best features of each approach. Even the approach discussed in [34] could
be viewed as a special case of the hybrid architecture. Thus the major contribution of
Chapter 3 is the unification of the different decision feedback approaches into a common
architecture, and the illustration that the ideas used in each of the previous DFEs are not
mutually exclusive.

We next proceeded to begin a study of receiver architectures for CDMA links with
convolutional coding. It was natural to begin the study with the optimal receiver
(optimal in the sequence error probability sense). For the sake of simplicity, this
maximum likelihood sequence estimator was derived for the case where each user employed
the same rate 1/2 code. This derivation generalizes easily to the case where the users
employ rate P/Q codes and also the case where each user employs a different code.
After the metric was derived for the rate 1/2 case, it was shown that the decoder may be
implemented using a Viterbi algorithm which operates on a time-varying trellis whose
number of states is exponential in the product KW (recall that W is the constraint length
of the codes and K is the number of users in the system). The time complexity per decoded
bit and rough number of arithmetic operations per decoded bit were then determined for
this receiver and were seen to
be exponential in both K and W. In the general rate P/Q code case, the complexity was
exponential in P, and k, which is the binary memory order of the code.

A performance analysis was then undertaken for the MLSE and an upper and lower
bound on the asymptotic efficiency of this receiver relative to an uncoded coherent BPSK
receiver was determined. This asymptotic efficiency was given the name asymptotic
multiuser coding gain (AMCG). It was seen that the AMCG unifies the asymptotic coding
gain parameter and the asymptotic multiuser efficiency parameter, which are traditional
figure of merit parameters for single-user coded systems and multiuser uncoded systems
respectively. The bounding procedure on the AMCG was used to avoid having to solve
the NP-hard problem of searching for the valid error sequence which minimizes the
efficiency equation (4.42) over the infinite set of all valid error sequences. Finally, some
simulations were presented to illustrate the performance of the MLSE at moderate and
low bit error rates.

The very high complexity of the MLSE illustrated the need for suboptimum
basestation architectures which perform nearly as well as the MLSE with a lower complexity.
Chapter 5 examined a large number of possible receiver architectures which attempt to
satisfy this need.

The first architectures examined were the partitioned receivers, which treat the
equalization of the MUI and the decoding of the code separately. These approaches may
be subdivided into hard and soft decision partitioned receivers. In the hard-decision
partitioned approach, an inner multiuser receiver operates on the received code symbol
sequence without regard to the coding and then supplies hard-decisions to a bank of outer
Viterbi decoders. In the soft-decision approach, the inner multiuser receivers are
modified to supply soft-decisions to the outer Viterbi decoders. These approaches were
analyzed in terms of AMCG, TCB, arithmetic operations per decoded bit and decoding
delay for various inner receivers. Because the soft-decision partitioned receiver with a
DFE as the inner receiver did not lend itself to an AMCG analysis, a computer simulation
was used to compare it to the important baseline approaches.

The next family of suboptimum approaches which were examined were those which
combine the operations of equalization and decoding into a single operation. The
integrated DFE was introduced as a DFE which estimates the MUI by exploiting the
coding on the link. This approach was shown to perform better than the partitioned DFE
approaches in most cases. If the interference and noise are so severe as to cause the
Viterbi decoder to provide a worse sequence estimate than a simple symbol-by-symbol
detector which does not consider the coding, then the integrated DFE does not perform as
well as a well designed partitioned DFE. Again because of the presence of error
propagation, the integrated DFE's performance was estimated using a computer simulation.
Also, short discussions were given for suboptimum combined equalization and decoding
approaches which were linear and trellis/tree based. Chapter 5 concluded with a brief
comparison of the complexities and decoding delays of the various receivers.

There is no single clear winning approach, as each provides advantages and
disadvantages in different situations. If an unlimited amount of processing power is available,
the optimal solution is the clear winner. Furthermore, for a network with only a few users
and small codes, the complexity of the optimal receiver may be tolerable; however, this
situation will probably be very uncommon in practice. If the processing power of the
receiver is very limited, it may only be possible to use a conventional receiver. The
conventional receiver will typically provide the lowest performance of any approach studied,
however. For situations where there is enough processing power to implement an
intermediate approach, we have examined a range of options in Chapter 5. The partitioned
approaches with soft-decision inner receivers will provide a relatively low complexity
and high performance in many situations. If the processing power is high enough, a
combined equalization and decoding approach will generally outperform a partitioned
approach. In many situations, the integrated DFE seemed to provide the best
performance of the approaches considered; however, its time complexity per decoded bit and
decoding delay are higher than those of many of the other approaches. The integrated DFE's
complexity as measured in terms of the number of arithmetic operations per decoded bit
is comparable to many of the partitioned approaches, however. The partitioned
SCS-DFE approaches, particularly with a hybrid DFE inner multiuser receiver, are also
probably a good compromise in many situations.

This  dissertation  has  provided  an  introduction  to  a  large  number  of  approaches  and 
their  performance  and  complexity. 


6.2 Future Research Possibilities

There are many possibilities for future research in this area. This work was
concerned with CDMA links wherein convolutional coding is employed. Convolutional
codes are a logical choice of codes for this situation, but there may be links where trellis
coded modulation (TCM) or block coding is used instead. The extension to the TCM
case will not be particularly difficult. Formulating the optimal receiver for the block
coding case may be more difficult, although many of the suboptimal approaches will easily
generalize to this case.

Some work has been done recently to develop expressions for the asymptotic
multiuser efficiency (AME) of the Varanasi DFE, [29]. This work does not extend easily to
the analysis of the AME of the hybrid DFE; however, the development of AME
expressions for the hybrid DFE would be a big contribution. Furthermore, it would also be
worthwhile to develop expressions for the AMCG of the DFE approaches discussed in
Chapter 5. This would allow a direct comparison with the other approaches via the
AMCG performance measure.

Another interesting extension of this work is to consider dispersive channels. There
has been a significant amount of work recently on the design of multiuser receivers for
dispersive uncoded CDMA systems, [9], [31] - [33]. Because many cellular channels are
somewhat dispersive, it would be worthwhile to unify the work in [9], [31] - [33] with
that in this dissertation.

Finally,  by  no  means  has  this  dissertation  proposed  every  possible  multiuser 
receiver  for  coded  links.  There  are  most  likely  other  solutions  which  have  not  yet  been 
developed  which  may  produce  robust,  low-complexity,  high  performance  basestations. 
This  dissertation  has  addressed  many  multiuser  receiver  architectures,  but  there  may  be 
some  sophisticated  new  solutions  waiting  to  be  discovered  still. 

In conclusion, it is hoped that this dissertation has opened the door to the field of
multiuser detection for coded CDMA systems. We have seen that there exists great
potential for multiuser receivers to significantly improve upon the performance of the
conventional basestation architecture. In the future, this work will undoubtedly lead to a
significant improvement of the capacity of CDMA networks and should help illustrate
that CDMA is a very attractive and worthwhile technique for allowing many users to
share the crowded spectrum.

Abstract

Motivated by the high complexity of the optimal sequence estimator for
convolutionally coded asynchronous CDMA systems, which is developed in [25], and the
potentially poor performance of the conventional receiver due to multiuser interference and the
near-far problem, in this paper we examine relatively simple multiuser receivers which
perform nearly as well as the optimal receiver. The multiuser receivers discussed in this
paper are of two types. The first set of approaches are partitioned approaches that treat
the multiuser interference equalization problem and the decoding problem separately.
The second set of approaches are integrated approaches that perform both the
equalization and decoding operations together. We study linear, decision feedback and
trellis/tree-based approaches in each category. The asymptotic efficiency of this receiver
relative to an uncoded coherent BPSK receiver (termed asymptotic multiuser coding
gain, or AMCG) is used as a performance criterion throughout. Also, computer
simulations are used whenever the computation of the AMCG is not feasible. It is shown that a
number of the approaches which are introduced in this paper achieve a high performance
level with a moderate complexity.


1.  Introduction 


There has been a large amount of interest recently in the design of multiuser receivers
for CDMA systems. These receivers jointly estimate the transmitted symbols of all of the
users in the system, as opposed to estimating them independently. This approach is most
appropriate for a base station in a multipoint-to-point network where the receiver must
acquire and demodulate all of the signals in the network. Almost all of the multiuser
detection work has centered on uncoded links; see for example [1] - [8], [10] - [21]. Only recently
has the problem of multiuser detection of coded links been considered, [22] - [26] and [30].
In [26], a sliding window version of the decorrelator introduced in [4] was
presented, and the authors alluded to the use of coding on the link as well. In [30], a partitioned
soft-decision trellis-based approach was considered wherein the equalization and decoding
operations are performed separately.

In [25], the ML sequence estimator was introduced. Its performance was significantly
better than the conventional receiver's; however, its complexity was prohibitively high.
Motivated by the need for low complexity receivers with a performance level that is
commensurate with the optimal sequence estimator's, we search in this paper for low-complexity
suboptimal receivers. Figure 1 outlines the various approaches that will be examined in this
paper.

Through an asymptotic analysis and simulation, it will be shown that these multiuser
detection techniques are able to significantly improve the performance of the conventional
basestation architecture. In [25], an important performance measure, named the asymptotic
multiuser coding gain (AMCG), was introduced. This parameter may be defined, in general,
as the required energy of a binary antipodal single-user receiver which achieves the same
performance as the multiuser receiver (as the noise power approaches zero), divided by the
required energy of a single-user binary antipodal receiver for an uncoded link. This
parameter reduces to the familiar asymptotic multiuser efficiency (AME) parameter for the
uncoded multiuser case, [2], and to the asymptotic coding gain (ACG) in the single-user
coded case. Several of the decision feedback approaches which will be studied in this paper
do not lend themselves to an analysis in terms of AMCG. As a result, these approaches will
be compared with the important baseline architectures via a computer simulation.

Rather  than  introducing  the  suboptimum  receivers  of  Figure  1  in  the  order  that  they 
appear  in  the  figure,  it  will  be  preferable  to  first  discuss  the  partitioned  approaches,  and  then 
to  discuss  the  combined  equalization  and  decoding  approaches  afterwards.  This  presentation 
will  provide  a  unified  view  of  the  various  approaches. 


2.  Notation 

It  will  be  assumed  that  the  CDMA  system  has  K  users  operating  simultaneously  on  a 
common  frequency  in  an  asynchronous  fashion.  In  general,  it  will  be  assumed  that  each  user 
employs  binary  convolutional  coding  on  its  link.  One  further  assumption  in  this  paper  is  that 
each  user  employs  the  same  convolutional  code,  although  it  is  not  at  all  difficult  to  generalize 
this  work  to  the  case  where  each  user  employs  a  different  code. 

At each time interval of length $T_s$, the convolutional code is generated for user $k$ by
passing $P$ binary information bits, $I_k(n) = (I_k^{(1)}(n), \ldots, I_k^{(P)}(n))$, through a shift register
consisting of $W$ stages with $Q$ modulo-2 adders. The number of output bits for each $P$-bit input
sequence is $Q$ bits. The rate of the code is $R_c = P/Q$ and the constraint length of the code is
$W$. The output sequence of binary code bits for the interval corresponding to input bits $I_k(n)$
is $(D_k^{(1)}(n), \ldots, D_k^{(Q)}(n))$. Note that for $W = 1$ and $P = Q = 1$, we have the uncoded case, so in
that case $D_k(n) = I_k(n)$.
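
The encoder just described can be sketched in a few lines. This is only an illustration under stated assumptions: it takes $P = 1$, and the default generator taps (octal 7 and 5, a common four-state rate 1/2 choice) are our assumption rather than a code specified in the text; the function name `conv_encode` is hypothetical.

```python
def conv_encode(info_bits, generators=(0b111, 0b101), W=3):
    """Rate-1/Q convolutional encoder sketch (P = 1, Q = len(generators)).

    Assumption: the generator taps are illustrative defaults, not taken
    from the text. W is the number of shift register stages.
    """
    register = 0                      # W-stage shift register, newest bit in the MSB
    mask = (1 << W) - 1
    code_bits = []
    for bit in info_bits:
        # Shift the new information bit into the register.
        register = ((bit << (W - 1)) | (register >> 1)) & mask
        # Each generator is one modulo-2 adder over the tapped stages.
        for g in generators:
            code_bits.append(bin(register & g).count("1") % 2)
    return code_bits


# A single 1 followed by zeros traces out a low-weight path of the code.
print(conv_encode([1, 0, 0]))
# With W = 1 and a single generator the encoder reduces to the uncoded case.
print(conv_encode([1, 0, 1], generators=(1,), W=1))
```

With the assumed taps, the response to a single information 1 has Hamming weight 5, which agrees with the free distance quoted later for the four-state rate 1/2 code.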

In the time interval $[nT_s + (q-1)T + t_k,\; nT_s + qT + t_k)$, user $k$ transmits data bit $D_k^{(q)}(n)$,
where $t_k$ represents the time shift of the $k$th user relative to some reference time, thus
accounting for the asynchronism of the users relative to each other. $T$ represents the code bit
period and $T_b = T/R_c$ is the information bit duration, thus $T_s = QT = PT_b$. Let $t_k = m_k T + \tau_k$,
$\tau_k \in [0, T)$, and $m_k \in \{0, \ldots, Q-1\}$. Thus $m_k T$ is a coarse time shift and $\tau_k$ is a fine time shift
for user $k$.


Each  user  in  the  system  is  assigned  a  particular  signature  sequence,  and  it  will  be 
assumed  that  this  signature  sequence  has  a  duration  equal  to  the  code  bit  interval,  although 
this  assumption  can  be  relaxed  with  a  change  of  the  notation.  We  will  combine  the  carrier 
and signature sequence into a single signal; thus the $k$th carrier multiplied by the binary ($\pm 1$)
signature sequence, $PN_k(t)$, will be denoted by

$$S_k(t) = \begin{cases} \sqrt{2/T}\; PN_k(t) \cos(\omega_c t), & 0 \le t \le T \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

The energy of the $k$th user's code bit measured at the receiver will be denoted by $E_k$. It will
be assumed that all $K$ users transmit their signals through a common additive white Gaussian
noise channel with two-sided noise spectral density $N_0/2$, and so the received signal will
have the following form

$$r(t) = \sum_{n=-\infty}^{\infty} \sum_{k=1}^{K} \sum_{q=1}^{Q} \sqrt{E_k}\, D_k^{(q)}(n)\, S_k\big(t - nT_s - (q-1)T - t_k\big) + z(t) \tag{2}$$

where $z(t)$ denotes the noise.

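Next we define the partial cross-correlation of the known signature sequences $j$ and $k$ to
be

$$\rho_{jk}(l) = \int_{-\infty}^{\infty} S_j(t - \tau_j)\, S_k(t + lT - \tau_k)\, dt . \tag{3}$$

It is worth noting that $\rho_{jj}(0) = 1$ and $\rho_{jk}(l) = \rho_{kj}(-l)$.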
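
A discrete-time analogue may help make the two properties just noted concrete. The sketch below is an assumption-laden simplification: it works at baseband with chip-aligned delays, replaces the integral by a sum over chips, and uses made-up signature sequences; `chip` and `partial_xcorr` are hypothetical helper names.

```python
import math

def chip(sig, t):
    """Value of a length-N chip sequence at integer chip time t (zero outside one code bit)."""
    return sig[t] if 0 <= t < len(sig) else 0.0

def partial_xcorr(sig_j, sig_k, tau_j, tau_k, l):
    """Chip-level analogue of the partial cross-correlation for lag l in {-1, 0, 1}.

    Assumptions: delays tau_j, tau_k are integer chip offsets in [0, N) and the
    signatures are normalized to unit energy.
    """
    N = len(sig_j)
    return sum(chip(sig_j, t - tau_j) * chip(sig_k, t + l * N - tau_k)
               for t in range(-2 * N, 3 * N))

# Two illustrative (made-up) unit-energy signatures of length 7.
N = 7
s1 = [c / math.sqrt(N) for c in (1, 1, -1, 1, -1, -1, 1)]
s2 = [c / math.sqrt(N) for c in (1, -1, -1, 1, 1, 1, -1)]

for lag in (-1, 0, 1):
    print(lag, partial_xcorr(s1, s2, 0, 3, lag), partial_xcorr(s2, s1, 3, 0, -lag))
```

Each printed row shows the same value twice, which is the symmetry relation between the two index orderings at opposite lags.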

As in the uncoded case, there are a number of ways in which a multiuser receiver can
operate to improve upon the performance of the conventional basestation, which makes
decisions on each user's data using only the sequence of matched filter outputs for that user. In
the next two sections, we will begin by studying hard-decision and soft-decision partitioned
multiuser receivers which treat the equalization and decoding problems separately.

3.  Hard-Decision  Partitioned  Approaches 

The broad class of multiuser receiver architectures which treat equalization of the MUI
and decoding of the code separately, as shown in Figure 2, will be referred to as partitioned
multiuser receivers. Within this class of partitioned approaches are those which use a
hard-decision multiuser receiver to supply hard decisions from the inner channel to a bank of
outer Viterbi decoders, and those which use soft-decision multiuser receivers to supply soft
decisions to the outer decoders. For the hard-decision case, sufficient interleaving can provide
the outer decoders with statistics which can be accurately modeled as the outputs of a
bank of $K$ binary symmetric channels. This level of sufficient interleaving will typically be
achieved with a block interleaver which has a width equal to the release depth of the outer
Viterbi algorithms (roughly five times the constraint length, $W$), and a depth greater than the
average length of an error event (which is only a few code symbols at high SNR).
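
As a minimal sketch of the block interleaver mentioned above, the following writes symbols row by row into a `depth` x `width` array and reads them out column by column. The function names and the small dimensions in the example are illustrative assumptions; in practice the width would be on the order of the Viterbi release depth.

```python
def interleave(symbols, width, depth):
    """Write row-wise into a depth x width block, read column-wise."""
    assert len(symbols) == width * depth
    rows = [symbols[r * width:(r + 1) * width] for r in range(depth)]
    return [rows[r][c] for c in range(width) for r in range(depth)]

def deinterleave(symbols, width, depth):
    """Inverse operation: write column-wise, read row-wise."""
    assert len(symbols) == width * depth
    cols = [symbols[c * depth:(c + 1) * depth] for c in range(width)]
    return [cols[c][r] for r in range(depth) for c in range(width)]

# Hypothetical toy block: width 4, depth 3.
block = list(range(12))
sent = interleave(block, width=4, depth=3)
print(sent)
print(deinterleave(sent, width=4, depth=3))
```

Adjacent symbols on the channel come from different rows, so a short burst of channel errors is spread `width` positions apart at the decoder input.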


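The crossover probability for the $k$th user's binary symmetric channel may be written as

$$p_k \approx a_{\eta_k}\, Q\!\left(\sqrt{\frac{2 \eta_k E_k}{N_0}}\right) \tag{4}$$

where $\eta_k$ is the AME (introduced in [2]) of the $k$th user's multiuser receiver which
operates on the sequence of code symbols as though they were uncoded symbols, and $a_{\eta_k}$ is
the effective multiplicity of competing sequences of distance $\eta_k$. The first-event error
probability is given by

$$P_e \le \sum_{d=d_f}^{\infty} a_d P_2(d) \tag{5}$$

where $a_d$ is the multiplicity of paths with a distance $d$ from the desired path and $P_2(d)$ is the
probability of confusing two sequences which are $d$ Hamming units apart. The first term of
this series will dominate for low noise situations:

$$P_e \approx a_{d_f} \binom{d_f}{t+1}\, p_k^{\,t+1} \quad \text{as } N_0/2 \to 0 \tag{6}$$

where

$$t = \left\lfloor \frac{d_f - 1}{2} \right\rfloor . \tag{7}$$

The leading term in (6) may be upper bounded using the asymptotically tight bound
$Q(x) \le \tfrac{1}{2} \exp(-x^2/2)$ as follows

$$a_{d_f} \binom{d_f}{t+1}\, p_k^{\,t+1} \le a_{d_f} \binom{d_f}{t+1} \left(\frac{a_{\eta_k}}{2}\right)^{t+1} \exp\!\left(-(t+1)\, \eta_k R_c \frac{E_{bk}}{N_0}\right) \tag{8}$$

Because this bound is asymptotically tight, we have

$$\eta_k^c = R_c\,(t+1)\,\eta_k = R_c \left( \left\lfloor \frac{d_f - 1}{2} \right\rfloor + 1 \right) \eta_k \tag{9}$$

as long as the interleaving is perfect.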

This is an important result because it is a simple relation for the AMCG of the overall
hard-decision partitioned multiuser receiver in terms of 1) the code rate, 2) the free distance
of the code and 3) the AME of the multiuser receiver which is employed for making code bit
decisions. It is not difficult to show that the asymptotic coding gain for a hard-decision
receiver operating in isolation is

$$ACG = R_c\,(t+1) = R_c \left( \left\lfloor \frac{d_f - 1}{2} \right\rfloor + 1 \right) \tag{10}$$

thus the AMCG is actually the product of the AME of the inner receiver and the ACG of a
hard-decision receiver operating in isolation using the same code. The AME has been
computed for many multiuser receivers on uncoded links, and so using those results from the
literature, we may easily compute the AMCG for the perfectly interleaved hard-decision
partitioned receiver of interest.


Using the expressions for the AME of the inner multiuser receivers from [2], [3], [4],
and [17], we may plot the AMCG for a hard-decision partitioned receiver in a 2-user system
with a conventional inner receiver, a decorrelator, and an ML sequence estimator, versus
the near-far energy ratio for specific codes and channel conditions. In Figure 3, the AMCG is plotted
versus the near-far energy ratio for user one and the case where both users employ a rate 1/2 4-state code
with $d_f = 5$, and $\rho_{21}(0) = \rho_{21}(1) = 0.2$. For this code, by equation (10) we know that the
ACG for user one operating in isolation is ACG = 1.5, or $10 \log_{10}(1.5) \approx 1.76$ dB. Figure 4 shows the
AMCG versus near-far energy ratio again for the same code with $d_f = 5$, but this time with
$\rho_{21}(0) = \rho_{21}(1) = 0.3$. These figures illustrate that as the channel cross-correlations become
higher, the achievable multiuser coding gain for the partitioned receivers drops. Figure 5
shows the same curves for the case where the codes employed are rate 1/2 64-state codes
which have $d_f = 10$ and again $\rho_{21}(0) = \rho_{21}(1) = 0.3$. From this figure and equation (9), it is
clear that a stronger code is able to improve the achievable multiuser coding gain given the
same channel conditions (compare with Figure 4).
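
The product relation between the inner receiver's AME and the hard-decision ACG is easy to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, and the two-user conventional-receiver AME expression used as an example inner receiver is the standard asynchronous result from the uncoded literature, which we assume carries over unchanged here.

```python
import math

def hard_decision_acg(rate, d_free):
    """ACG of a hard-decision decoder in isolation: R_c * (floor((d_f - 1) / 2) + 1)."""
    return rate * ((d_free - 1) // 2 + 1)

def conventional_ame_two_user(energy_ratio, rho0, rho1):
    """AME of user one's conventional receiver; energy_ratio is E2/E1 (assumed form)."""
    return max(0.0, 1.0 - math.sqrt(energy_ratio) * (abs(rho0) + abs(rho1))) ** 2

def hard_decision_amcg(rate, d_free, ame):
    """AMCG of the perfectly interleaved hard-decision partitioned receiver."""
    return hard_decision_acg(rate, d_free) * ame

# Rate 1/2 codes with d_f = 5 and d_f = 10, as in the examples above.
print(hard_decision_acg(0.5, 5), hard_decision_acg(0.5, 10))
for ratio in (0.0, 0.5, 1.0, 2.0):
    ame = conventional_ame_two_user(ratio, 0.2, 0.2)
    print(ratio, hard_decision_amcg(0.5, 5, ame))
```

Any other inner receiver can be substituted by passing its AME value to `hard_decision_amcg`.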

The complexity of the various receivers may be measured with the time complexity per
decoded bit (TCB). The overall TCB will be the sum of the TCB of the outer Viterbi decoders and
$Q$ times the TCB of the inner multiuser receiver, since $Q$ code bits must be decided for every
stage of the outer decoders. If we assume that the code is a rate $P/Q$ code and has a binary
memory order of $\kappa$ bits, then the outer Viterbi algorithms will have a TCB $= O(2^{\kappa+P}/P)$.
(Note that if code puncturing is used to obtain a rate $P/Q$ code from a rate $1/Q$ code, then the
complexity of the outer Viterbi algorithms will be TCB $= O(2^{\kappa+1})$ since there are $2^{\kappa}$ states
and 2 branches per state.) Furthermore, the conventional inner receiver will have
$TCB_{conv} = O(1)$, the MLSE inner receiver will have $TCB_{MLSE} = O(2^K)$ and the $J$-stage DFE
inner receiver will have roughly $TCB_{DFE} = O(J)$, assuming that the complexity of one MUI
calculation is roughly equivalent to a metric calculation (which it often is not). It may be
more useful to compare the rough number of arithmetic operations (or multiplications) per
decoded bit to make a fairer comparison. For the MLSE this is roughly $O(K 2^K)$, and for
the $J$-stage DFE this is approximately $O(JK)$. The decorrelator requires roughly $O(\delta K)$
operations per decoded bit if $\delta$ is the impulse response length truncation depth of a
decorrelator which is implemented with an FIR matrix filter. It follows that the overall number of
arithmetic operations per decoded information bit for the hard-decision partitioned receivers
is on the order of $OP_{conv} = O(2^{\kappa+P}/P)$ for the conventional inner receiver,
$OP_{MLSE} = O([QK2^K + 2^{\kappa+P}]/P)$ for the MLSE inner receiver, $OP_{JDFE} = O([QKJ + 2^{\kappa+P}]/P)$
for the $J$-stage DFE inner receiver and $OP_{dec} = O([QK\delta + 2^{\kappa+P}]/P)$ for the decorrelator inner
receiver.
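
To get a feel for how these orders of growth compare, the following sketch tabulates the leading terms with all hidden constants set to one. That choice, the parameter values, and the function name `ops_per_info_bit` are assumptions made purely for illustration; the numbers are not operation counts of any particular implementation.

```python
def ops_per_info_bit(inner, K, P, Q, kappa, J=1, delta=1):
    """Leading-order arithmetic operations per decoded information bit.

    inner: 'conventional', 'mlse', 'dfe' or 'decorrelator'.
    kappa: binary memory order of the code; constants are taken as 1 (assumption).
    """
    viterbi = 2 ** (kappa + P)                  # outer Viterbi work per trellis stage
    inner_ops = {
        "conventional": 0,                      # matched filter outputs passed straight through
        "mlse": Q * K * 2 ** K,
        "dfe": Q * K * J,
        "decorrelator": Q * K * delta,
    }[inner]
    return (inner_ops + viterbi) / P

# Hypothetical setting: 4 users, rate 1/2 four-state code (kappa = 2).
for name in ("conventional", "dfe", "decorrelator", "mlse"):
    print(name, ops_per_info_bit(name, K=4, P=1, Q=2, kappa=2, J=3, delta=5))
```

Even for four users the MLSE term dominates, and it doubles with every additional user, while the DFE and decorrelator terms grow only linearly in $K$.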

4. Soft-Decision Partitioned Approaches

The computation of the AMCG for the soft-decision partitioned approaches is more
difficult than for the hard-decision case. We will have to write expressions for the decision
statistics at the outer Viterbi decoders for the various inner multiuser receivers, and then
upper bound the worst-case values of these decision statistics to obtain lower bound
expressions for the AMCG of the overall receivers. It is interesting and important to note that the
conventional receiver may be viewed as a member of the class of soft-decision partitioned
receivers with a degenerate multiuser receiver which simply passes the matched filter outputs
to the outer Viterbi algorithms without altering them. As a result, by analyzing the multiuser
receivers in this class, we will also be analyzing the important conventional receiver's
performance.

Consider the system shown in Figure 2 again. If the deinterleaved outputs of the
multiuser receiver are now considered to be soft outputs denoted by $y_k^{(q)}(n)$ for the $k$th user's
code bit $q$ in the $n$th interval, then the first question to be asked is, "What is the structure of
the optimal subsequent decoders?" If $y_k^{(q)}(n)$ is conditionally Gaussian, then the appropriate
decoding strategy for sequences over a decoding window $n = i_0$ to $i_0 + \Gamma$ is to use a Viterbi
algorithm with the following correlation metric

$$\Lambda(y_k \mid D_k) = \sum_{n=i_0}^{i_0+\Gamma} \sum_{q=1}^{Q} y_k^{(q)}(n)\, D_k^{(q)}(n) \tag{11}$$

where $y_k$ is the deinterleaved sequence of soft-decision outputs of user $k$'s multiuser receiver,
and $D_k$ is the sequence of transmitted code symbols for user $k$. Note that if we wish to
maintain consistency with the "horizon" used in [25], $i_0 = \alpha(-M)$ and $\Gamma = \alpha(M) - i_0$.

The notation used in equation (11) is going to become overly complex later and so we
will simplify this equation by defining a modulo $Q$ decomposition of an index, $j$, in the same
way that the modulo $K$ decomposition was defined in [1] and [25]. In this way, we can write
(11) with a single sum which accumulates all $Q$ of the code bits for each interval, $n$, for user
$k$.

$$\Lambda(y_k \mid D_k) = \sum_{j=i_0}^{i_0+Q\Gamma} y_{kj} D_{kj} \tag{12}$$

In this equation, $y_{kj} = y_k^{(\beta(j))}(\alpha(j))$, $D_{kj} = D_k^{(\beta(j))}(\alpha(j))$, and $j = \alpha(j) Q + \beta(j) - 1$. (Note that we
assume that $i_0 = \beta(i_0)$ without a loss in generality.)
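
The index decomposition and the single-sum metric can be illustrated directly. This is a small sketch with hypothetical helper names and made-up, noiseless soft outputs; it only demonstrates the bookkeeping, not a full Viterbi decoder.

```python
def decompose(j, Q):
    """Split index j into (alpha, beta) with j = alpha*Q + beta - 1 and beta in 1..Q."""
    alpha, remainder = divmod(j, Q)
    return alpha, remainder + 1

def correlation_metric(y, d):
    """Single-sum form of the metric: sum over j of y_kj * D_kj."""
    return sum(y_j * d_j for y_j, d_j in zip(y, d))

# Hypothetical example with Q = 2: noiseless soft outputs for the transmitted symbols.
D = [+1, +1, +1, -1, +1, +1]
y = [float(v) for v in D]
competitor = [+1, +1, -1, -1, -1, +1]   # differs from D in two code symbols

print([decompose(j, 2) for j in range(6)])
print(correlation_metric(y, D), correlation_metric(y, competitor))
```

The transmitted sequence achieves the larger metric, which is the comparison the outer Viterbi algorithm performs along each pair of merging paths.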

The metric for any valid competing sequence in the trellis will be

$$\Lambda(y_k \mid D_k + 2e_k) = \sum_{j=i_0}^{i_0+Q\Gamma} y_{kj}\,[D_{kj} + 2e_{kj}] \tag{13}$$

where $e_k$ is user $k$'s error sequence, and $e_{kj} = e_k^{(\beta(j))}(\alpha(j))$. It follows that the
two-sequence error probability for sequences differing by $e_k$ will be given by

$$P_2(e_k) = P[\Lambda(y_k \mid D_k + 2e_k) > \Lambda(y_k \mid D_k)] = P\left[\sum_{j=i_0}^{i_0+Q\Gamma} y_{kj} e_{kj} > 0\right] \tag{14}$$

where it is assumed that the nonzero portion of the error sequence $\{e_{kj}\}$ occurs in the region
$i_0 \le j \le i_0 + Q\Gamma$.

To proceed, we need to be able to characterize $y_{kj}$. To do this, let

$$y_{kj} = D_{kj}\sqrt{E_k} + N_{kj} \tag{15}$$

where $N_{kj}$ is the noise for user $k$'s code bit $\beta(j)$ in the $\alpha(j)$th interval after deinterleaving the
soft-decision multiuser receiver outputs. The characteristics of the noise will depend on the
inner multiuser receiver in use.

4.1  The  Conventional  Receiver 

With this generic description of the inputs to the outer Viterbi decoders, we may now
consider a number of special cases for specific soft-decision multiuser inner receivers. One
of the most important special cases of the soft-decision partitioned receiver is the
conventional receiver. This receiver essentially uses a degenerate multiuser receiver which
simply passes the matched filter outputs on to the outer Viterbi algorithms without altering
them in any way, except possibly unscrambling them in a deinterleaver. For the
conventional receiver, each input to the outer Viterbi algorithms corresponds to a desired part,
$D_{kj}\sqrt{E_k}$, and a noise part, corresponding to

$$N_{kj} = RMUI_{kj} + z_{kj} = MUI_{kj} + z_{kj} \tag{16}$$

$RMUI_{kj}$ denotes the residual MUI, which for the conventional receiver is equal to the MUI
on the matched filter output, and $z_{kj}$ denotes the Gaussian portion of the noise. It is worth
noting that this overall noise is not Gaussian, and so the use of the correlation metric in the
outer Viterbi decoders is not optimum. The noise is Gaussian, conditioned on the signal plus
interference; however, the outer Viterbi algorithms do not condition on the interference due
to the other users. It is common to appeal to the Central Limit
Theorem to claim that the noise in equation (16) is approximately Gaussian. This leads to
the claim that the correlation metric is appropriate for the outer decoders. The Central Limit
Theorem leads to misleading and overly optimistic conclusions in many cases, however.

Because the noise statistic in (16) will be Gaussian conditioned on a given sequence of
the desired user and the interferers, we could obtain a performance estimate by averaging
with respect to all possible sequences. This performance will be asymptotically determined
by the worst-case interference. As a result, we may bound the residual MUI by its
effective worst-case value for completely unconstrained interferers to obtain a lower bound
on the AMCG of the conventional receiver. The worst-case value of $RMUI_{kj}$ when the
constraints on the other users' transmitted code sequences are taken into account will be no
greater than (and most often lower than) the value assuming unconstrained sequences. If
interleaving is used on the link, then the interference patterns will be closer to unconstrained
interference patterns since the interleaving will effectively break up the code's constraints.
Another point worth noticing is that according to (16), the noise on the receiver outputs is the
same as the noise on the matched filter outputs, albeit potentially scrambled by the
deinterleaving process. The noise sequence $\{z_{kj}\}$ is white, and the deinterleaving will not affect
this.

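With this characterization of $y_{kj}$, we may proceed to substitute (15) and (16) into (14).

$$P_2(e_k) = P\left[\sum_{j=i_0}^{i_0+Q\Gamma} e_{kj}\left(D_{kj}\sqrt{E_k} + MUI_{kj}\right) + \sum_{j=i_0}^{i_0+Q\Gamma} e_{kj} z_{kj} > 0\right] \tag{17}$$

Next define

$$\beta = \sum_{j=i_0}^{i_0+Q\Gamma} e_{kj} z_{kj} \tag{18}$$

Note that whenever $e_{kj} \ne 0$, then $e_{kj} = -D_{kj}$ (since $D_{kj} + 2e_{kj}$ must be a valid code symbol).
Making these substitutions we get

$$P_2(e_k) = P\left[\beta > \sum_{j=i_0}^{i_0+Q\Gamma} \left(e_{kj}^2 \sqrt{E_k} - e_{kj}\, MUI_{kj}\right)\right] \tag{19}$$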

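Next, replace $e_{kj}\, MUI_{kj}$ by its largest possible value for completely unconstrained interferers,

$$\sum_{m=1}^{k-1} |\rho_{km}(1)| \sqrt{E_m} + \sum_{m \ne k} |\rho_{km}(0)| \sqrt{E_m} + \sum_{m=k+1}^{K} |\rho_{km}(-1)| \sqrt{E_m} \tag{20}$$

Thus

$$P_2(e_k) \le P\left[\beta > \sum_{j=i_0}^{i_0+Q\Gamma} e_{kj}^2\, \gamma_k\right] \tag{21}$$

where

$$\gamma_k = \sqrt{E_k} - \sum_{m=1}^{k-1} |\rho_{km}(1)| \sqrt{E_m} - \sum_{m \ne k} |\rho_{km}(0)| \sqrt{E_m} - \sum_{m=k+1}^{K} |\rho_{km}(-1)| \sqrt{E_m} \tag{22}$$

We may next note that

$$wt[e_k] = \sum_{j=i_0}^{i_0+Q\Gamma} (e_{kj})^2 \tag{23}$$

so (21) becomes

$$P_2(e_k) \le P\big[\beta > wt[e_k]\, \gamma_k\big] \tag{24}$$

Because $\beta$ is a linear combination of independent Gaussian random variables of zero mean
and variance $N_0/2$, it is not difficult to show that $E[\beta] = 0$, and $E[\beta^2] = wt[e_k] \cdot N_0/2$. It
follows that

$$P_2(e_k) \le Q\!\left(\gamma_k \sqrt{\frac{2\, wt[e_k]}{N_0}}\right) \tag{25}$$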

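which implies that

$$\eta_k^c(e_k) \ge \frac{\gamma_k^2}{E_{bk}}\, wt[e_k] \quad \text{for } \gamma_k \ge 0 \tag{26}$$

Since $\gamma_k$, which was defined in (22), may be negative, we may generalize the bound in (26)
to be

$$\eta_k^c(e_k) \ge {\max}^2 \left\{ 0,\; \gamma_k \left(wt[e_k]/E_{bk}\right)^{1/2} \right\} \tag{27}$$

Finally, since $E_{bk} = E_k / R_c$, we note that

$$\eta^c_{k,\min} = \inf_{\text{valid } e_k} \eta_k^c(e_k) \ge {\max}^2 \left\{ 0,\; \gamma_k \left(d_f R_c / E_k\right)^{1/2} \right\} \tag{28}$$

Stated in a different form we have the final bound

$$\eta^c_{k,\min} \ge d_f R_c\; {\max}^2 \left\{ 0,\; 1 - \sum_{m=1}^{k-1} \sqrt{\frac{E_m}{E_k}}\, |\rho_{km}(1)| - \sum_{m \ne k} \sqrt{\frac{E_m}{E_k}}\, |\rho_{km}(0)| - \sum_{m=k+1}^{K} \sqrt{\frac{E_m}{E_k}}\, |\rho_{km}(-1)| \right\} \tag{29}$$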

For the 2-user, $R_c = 1/2$ case, this bound takes the form

$$\eta^c_{1,\min} \ge \begin{cases} \dfrac{d_f}{2} \left(1 - \big(|\rho_{21}(0)| + |\rho_{21}(1)|\big) \sqrt{E_2/E_1}\right)^2, & \text{if } \sqrt{E_2/E_1} \le \dfrac{1}{|\rho_{21}(0)| + |\rho_{21}(1)|} \\ 0, & \text{otherwise} \end{cases} \tag{30}$$

This lower bound on the AMCG for the conventional
receiver is potentially loose if the coding imposes severe restrictions on the allowable
interfering sequences. This is due to the fact that the worst-case allowable interfering
sequence may be much less severe than the unconstrained worst-case sequence. Nonetheless,
with good interleaving, we believe that it will be a reasonable approximation.

The bound in (30) is plotted in Figures 6 and 7 for the $|\rho_{21}(0)| + |\rho_{21}(1)| = 0.4$ and $0.6$ cases
respectively. From these figures, we see that as the energy of the interferer grows, the AMCG of
user 1's conventional receiver drops to zero. This zero AMCG typically implies that the
receiver will have a performance floor.
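
The two-user bound can be evaluated with a few lines of code to see where the AMCG collapses to zero. The sketch below assumes the two-user form of the bound with user one as the desired user; the function name and the sample energy ratios are illustrative.

```python
import math

def conventional_soft_amcg_bound(d_free, rate, energy_ratio, rho0, rho1):
    """Lower bound on the AMCG of user one's conventional soft-decision receiver.

    energy_ratio is E2/E1; rho0 and rho1 are the two partial cross-correlations.
    """
    margin = 1.0 - (abs(rho0) + abs(rho1)) * math.sqrt(energy_ratio)
    return d_free * rate * max(0.0, margin) ** 2

# Rate 1/2, d_f = 5 code on the two channels considered in the text.
for ratio in (0.0, 0.25, 1.0, 4.0):
    print(ratio,
          conventional_soft_amcg_bound(5, 0.5, ratio, 0.2, 0.2),
          conventional_soft_amcg_bound(5, 0.5, ratio, 0.3, 0.3))
```

With no interferer the bound equals $d_f R_c = 2.5$, the soft-decision single-user coding gain, and it reaches zero once the interferer's amplitude ratio exceeds the reciprocal of the summed cross-correlations.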

The TCB of the conventional receiver with soft decisions will be the same as that of
the partitioned hard-decision receiver with a conventional inner receiver.


4.2  Soft-Decision  Partitioned  Receiver  With  A  Linear  Multiuser  Receiver 

Another interesting class of partitioned receivers is those with a linear inner multiuser
receiver, [26]. The most well known members of the class of linear multiuser receivers are
the decorrelator, [4], and the minimum mean squared error (MMSE) receivers, [7] and [14].
In this section, we will focus on the decorrelator from [4] as a representative of this class
because it leads to a tractable analysis. This multiuser receiver has the property that
$RMUI_{kj} = 0$ at the decorrelator output, but the variance of the output noise $\tilde{z}_{kj}$ will generally be larger than
that of $z_{kj}$ due to this receiver's noise enhancement property (see equation (16)). For this
receiver, we may write the two-sequence error probability as

$$P_2(e_k) = P\left[\sum_{j=i_0}^{i_0+Q\Gamma} e_{kj}\left(D_{kj}\sqrt{E_k} + \tilde{z}_{kj}\right) > 0\right] \tag{31}$$

$$= P\left[\sum_{j=i_0}^{i_0+Q\Gamma} e_{kj} \tilde{z}_{kj} > -\sum_{j=i_0}^{i_0+Q\Gamma} e_{kj} D_{kj} \sqrt{E_k}\right] \tag{32}$$
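If we next notice that $D_{kj} = -e_{kj}$ whenever $e_{kj} \ne 0$, use (23), and redefine $\beta$ for this section in
the same fashion as in (18) to now be

$$\beta = \sum_{j=i_0}^{i_0+Q\Gamma} e_{kj} \tilde{z}_{kj} , \tag{33}$$

then we rewrite (32) as

$$P_2(e_k) = P\big[\beta > wt[e_k] \sqrt{E_k}\,\big] . \tag{34}$$

$\beta$ is a linear combination of $\tilde{z}_{kj}$'s which are each, in turn, a linear combination of the matched
filter output noises, $\{z_{kj}\}$. It is easy to show that $E[\beta] = 0$. The computation of the second
moment of $\beta$ requires more work, however.

$$E[\beta^2] = E\left[\sum_{j=i_0}^{i_0+Q\Gamma} e_{kj} \tilde{z}_{kj} \sum_{p=i_0}^{i_0+Q\Gamma} e_{kp} \tilde{z}_{kp}\right] \tag{35}$$

$$= \sum_{j=i_0}^{i_0+Q\Gamma} \sum_{p=i_0}^{i_0+Q\Gamma} e_{kj} e_{kp}\, E[\tilde{z}_{kj} \tilde{z}_{kp}] \tag{36}$$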

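Next, define

$$E[\tilde{z}_{kj} \tilde{z}_{kp}] = \frac{N_0}{2}\, \phi_{kk}(p - j) \tag{37}$$

Using this nomenclature, we may proceed to rewrite (34) as

$$P_2(e_k) = Q\!\left(\frac{wt[e_k] \sqrt{E_k}}{\sqrt{E[\beta^2]}}\right) \tag{38}$$

$$= Q\!\left(\sqrt{\frac{2 E_{bk}}{N_0} \cdot \frac{R_c\, wt[e_k]^2}{\sum_j \sum_p e_{kj} e_{kp}\, \phi_{kk}(p - j)}}\right) \tag{39}$$

so it is easy to see from (39) that

$$\eta_k^c(e_k) = \frac{R_c\, wt[e_k]^2}{\sum_j \sum_p e_{kj} e_{kp}\, \phi_{kk}(p - j)} \tag{40}$$

$$= \frac{R_c\, wt[e_k]^2}{wt[e_k]\, \phi_{kk}(0) + 2 \sum_{j=i_0}^{i_0+Q\Gamma} \sum_{l=1}^{Q\Gamma} e_{kj} e_{k,j-l}\, \phi_{kk}(l)} \tag{41}$$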

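Now, all that remains to be done to obtain numerical results is to evaluate $\phi_{kk}(l)$. We will
do this for the 2-user case. If $\rho(z)$ denotes the multiuser system channel transfer function
matrix, and so $\rho^{-1}(z)$ denotes the decorrelator's transfer function matrix, then for the 2-user
case we have, [4]

$$\rho^{-1}(z) = \frac{1}{1 - \rho_{12}^2 - \rho_{21}^2 - \rho_{12}\rho_{21} z - \rho_{12}\rho_{21} z^{-1}} \begin{bmatrix} 1 & -(\rho_{12} + \rho_{21} z^{-1}) \\ -(\rho_{12} + \rho_{21} z) & 1 \end{bmatrix} \tag{42}$$

so

$$\rho^{-1}_{kk}(z) = \frac{1}{1 - \rho_{12}^2 - \rho_{21}^2 - \rho_{12}\rho_{21} z - \rho_{12}\rho_{21} z^{-1}} \quad \text{for } k \in \{1, 2\} \tag{43}$$

Taking the inverse Z-transform of this expression, we obtain

$$\phi_{kk}(l) = \frac{\left[1 - \rho_{12}^2 - \rho_{21}^2 - \sqrt{[1 - (\rho_{12} + \rho_{21})^2][1 - (\rho_{12} - \rho_{21})^2]}\,\right]^{|l|}}{(2 \rho_{12} \rho_{21})^{|l|}\, \sqrt{[1 - (\rho_{12} + \rho_{21})^2][1 - (\rho_{12} - \rho_{21})^2]}} \tag{44}$$

where $\rho_{12} = \rho_{21}(0)$, and $\rho_{21} = \rho_{21}(1)$, [4].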

We may obtain a lower bound on the AMCG via the following procedure. Note that
$\phi_{kk}(l)$ has the property that $\phi_{kk}(0) > \phi_{kk}(1) > \phi_{kk}(2) > \cdots$ so the second term in the
denominator of (41) may be upper bounded as follows

$$2 \sum_{j=i_0}^{i_0+Q\Gamma} \sum_{l=1}^{Q\Gamma} e_{kj} e_{k,j-l}\, \phi_{kk}(l) \le 2 \sum_{l=1}^{wt[e_k]-1} \phi_{kk}(l)\, \big(wt[e_k] - l\big) \tag{45}$$

and this allows us to bound the expression in (41) by

$$\eta_k^c(e_k) \ge \frac{R_c\, wt[e_k]^2}{wt[e_k]\, \phi_{kk}(0) + 2 \sum_{l=1}^{wt[e_k]-1} \phi_{kk}(l)\, \big(wt[e_k] - l\big)} = \xi(R_c, wt[e_k], \rho_{12}, \rho_{21}) \tag{46}$$

This is a bound on $\eta_k^c(e_k)$ which is only a function of $wt[e_k]$, rather than the actual error
sequence $e_k$. Using the fact that this expression is monotonically increasing with $wt[e_k]$, we
can obtain a lower bound on the AMCG for this partitioned receiver:

$$\eta^c_{k,\min} \ge \min_{e_k}\, \xi(R_c, wt[e_k], \rho_{12}, \rho_{21}) = \frac{R_c\, d_f^2}{d_f\, \phi_{kk}(0) + 2 \sum_{l=1}^{d_f - 1} \phi_{kk}(l)\, (d_f - l)} \tag{47}$$


As an example, for the rate 1/2 code with $d_f = 5$ and the case where $\rho_{21}(1) = \rho_{21}(0) = 0.3$, we get
$\eta^c_{1,\min} \ge 1.67$ (see Figure 7). The result for $\rho_{21}(0) = \rho_{21}(1) = 0.2$ is plotted in Figure 6.
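
The numerical example can be reproduced with a short script. The sketch below assumes the two-user closed form for the decorrelator output noise correlation and requires both cross-correlations to be nonzero; the function names `phi` and `decorrelator_amcg_lower_bound` are ours.

```python
import math

def phi(l, rho12, rho21):
    """Two-user decorrelator noise autocorrelation at lag l (unit N0/2 scaling).

    Assumes rho12 * rho21 != 0 and (|rho12| + |rho21|) < 1.
    """
    a = 1.0 - rho12 ** 2 - rho21 ** 2
    root = math.sqrt((1.0 - (rho12 + rho21) ** 2) * (1.0 - (rho12 - rho21) ** 2))
    ratio = (a - root) / (2.0 * rho12 * rho21)   # geometric decay factor per lag
    return ratio ** abs(l) / root

def decorrelator_amcg_lower_bound(rate, d_free, rho12, rho21):
    """Lower bound on the AMCG of the soft-decision partitioned decorrelator."""
    denom = d_free * phi(0, rho12, rho21)
    denom += 2.0 * sum(phi(l, rho12, rho21) * (d_free - l) for l in range(1, d_free))
    return rate * d_free ** 2 / denom

print(round(decorrelator_amcg_lower_bound(0.5, 5, 0.3, 0.3), 3))
print(round(decorrelator_amcg_lower_bound(0.5, 5, 0.2, 0.2), 3))
```

For the 0.3 channel this evaluates to roughly 1.675, in agreement with the value of 1.67 quoted above, compared with 2.5 for the same code on a single-user link.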

Another interesting result of (47) is that it provides a tightening of the lower bound on
the AMCG of the MLSE of [25]. This is due to the fact that the MLSE, which is the optimal
sequence estimator, will have a higher AMCG than the partitioned soft-decision decorrelator,
which is a suboptimal sequence estimator. This tightening of the bound on the MLSE's
AMCG is incorporated into Figures 6 and 7.

An upper bound on the AMCG for the soft-decision partitioned decorrelator may be
obtained by performing a non-exhaustive search over the set of possible valid $e_k$ sequences.
One valid error vector for the standard four-state rate 1/2 code is the following, $e_k$ =
(110111). This error vector gives the smallest result of equation (41) of those tested.
Because the actual minimum of equation (41) over the set of all valid error sequences must
be no larger than the minimum of (41) over the small set of sequences that we tested, we
have an upper bound on the AMCG. This upper bound turns out to be quite close to the
lower bound we have already obtained, so we have a very accurate picture of the partitioned
decorrelator's performance. This upper bound is also plotted in Figures 6 and 7.
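
The evaluation of the AMCG expression for one particular error vector can be sketched as follows. Two assumptions are made loudly here: the error vector (110111) is taken to have all nonzero entries of the same sign, since the text gives only the support, and the two-user noise correlation is computed from the closed form for the decorrelator with both cross-correlations nonzero. The function names are hypothetical.

```python
import math

def phi(l, rho12, rho21):
    """Two-user decorrelator noise autocorrelation at lag l (assumes rho12 * rho21 != 0)."""
    a = 1.0 - rho12 ** 2 - rho21 ** 2
    root = math.sqrt((1.0 - (rho12 + rho21) ** 2) * (1.0 - (rho12 - rho21) ** 2))
    return ((a - root) / (2.0 * rho12 * rho21)) ** abs(l) / root

def amcg_for_error_sequence(e, rate, rho12, rho21):
    """R_c * wt^2 divided by the double sum of e_j * e_p * phi(p - j)."""
    weight = sum(x * x for x in e)
    denom = sum(e[j] * e[p] * phi(p - j, rho12, rho21)
                for j in range(len(e)) for p in range(len(e)))
    return rate * weight ** 2 / denom

# Assumed sign pattern: every nonzero entry of (110111) is +1.
e = (1, 1, 0, 1, 1, 1)
print(round(amcg_for_error_sequence(e, 0.5, 0.3, 0.3), 3))
```

Under these assumptions the 0.3 channel gives a value of about 1.75, slightly above the lower bound of about 1.67, which is consistent with the two bounds being close.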

The complexity of the soft-decision and hard-decision partitioned receivers with a
decorrelating inner receiver will be the same. This is a result of the fact that the only
difference between the two is that the soft-decision version does not make a hard decision on the
decision statistics before passing them to the outer Viterbi decoders, and the outer decoder
metrics will differ but be of the same complexity order.

4.3 Soft-Decision Partitioned Receiver with a Trellis-Based or Tree-Based Inner
Receiver

A third approach which potentially will have the best performance of any partitioned
approach is the one which uses a soft-decision trellis-based or tree-based receiver as the
inner multiuser receiver. There are a number of possible trellis-based receivers, the most
important of which are the ML sequence estimator, [1], and the reduced state sequence
estimator (RSSE), [6], [31]. As an example of a tree-based receiver, see [8]. In their standard
form, each of these receivers outputs hard decisions, and so techniques such as those of [29]
must be applied to allow the inner receiver to supply soft outputs to the outer Viterbi
decoders.

This  approach  has  recently  been  proposed  by  two  research  groups,  [9],  and  [30].  Both 
groups  cite  the  prohibitive  complexity  of  the  full  ML  sequence  estimator,  and  examine  RSSE 
and  sequential  decoding  alternatives. 

The computation of the AMCG for these approaches remains an open problem. The
soft-decision partitioned receiver with an ML sequence estimator should have
a higher AMCG than any other partitioned receiver, since the ML sequence estimator is the
optimum inner receiver in the sequence error probability sense. It follows that RSSE and
sequential decoding approaches which do not suffer significantly in performance relative to
the MLSE will also have a high AMCG. The interested reader is referred to [9] and [30] for
more detail on these particular approaches.

4.4 Soft-Decision Partitioned Receiver With a DFE Inner Receiver

In this approach, a multistage DFE operates on the set of matched filter outputs by
making tentative decisions and feeding these decisions back to make estimates of the MUI,
which will be subtracted from other matched filter outputs. The key in the soft code symbol
DFE (SCS-DFE) approach is that at the final stage, the MUI estimate will again be
subtracted from the delayed matched filter output, but no hard-decision making will be
performed. Instead, the modified matched filter output will be passed straight to the Viterbi
decoder.
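
The idea of withholding the hard decision at the final stage can be sketched for a deliberately simplified case. The code below assumes a synchronous link with a single cross-correlation matrix and plain parallel cancellation at every stage; it is not the Varanasi or Hybrid DFE structure, and the function names and example values are hypothetical.

```python
def sign(x):
    return 1 if x >= 0 else -1

def soft_multistage_canceller(y, R, amplitudes, stages=2):
    """Simplified soft-output multistage interference canceller.

    y: matched filter outputs; R: cross-correlation matrix with unit diagonal;
    amplitudes: received amplitudes (square roots of the code bit energies).
    Assumption: synchronous users and perfectly known R and amplitudes.
    """
    K = len(y)
    decisions = [sign(v) for v in y]            # stage 0: conventional tentative decisions
    soft = list(y)
    for _ in range(stages):
        # Subtract the MUI estimate built from the latest tentative decisions.
        soft = [y[k] - sum(R[k][m] * amplitudes[m] * decisions[m]
                           for m in range(K) if m != k)
                for k in range(K)]
        decisions = [sign(v) for v in soft]     # only used if another stage follows
    return soft                                 # final stage: no hard decision is made

# Hypothetical noiseless three-user example with 0.2 cross-correlations.
R = [[1.0, 0.2, 0.2], [0.2, 1.0, 0.2], [0.2, 0.2, 1.0]]
A = [1.0, 1.0, 1.0]
b = [1, -1, 1]
y = [sum(R[k][m] * A[m] * b[m] for m in range(3)) for k in range(3)]
print(soft_multistage_canceller(y, R, A))
```

When the tentative decisions are all correct, as in this noiseless example, the soft outputs equal the interference-free matched filter outputs; with incorrect feedback the subtraction adds interference instead, which is the error propagation effect.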

At this point, we have not committed to a particular type of multistage DFE. As
discussed in [21], there have been at least six architectures proposed in the literature for
asynchronous CDMA links, [5], [7], [12 - 14], [19 - 21] and [27], and there has also been
some work on improving the decision making procedure of the algorithms, [16].

The performance of this class of approaches is difficult to evaluate analytically, due to
the presence of error propagation. Expressions for the AME of the Varanasi DFE have
recently been reported in [17], but the approaches used to get those AME expressions do not
easily generalize to the coded link case. It is possible to evaluate the AMCG under the
assumption of correct feedback. However, this implies that as long as the correlation
parameters and energies have been perfectly estimated, RMUI = 0. Since there is no residual
interference if the feedback is correct, and there is no noise enhancement as in the case of
the decorrelator, we obtain the result that the AMCG is that of a single-user system. Clearly,
the presence of error propagation degrades the AMCG by some amount, so the computation
of the actual AMCG remains an open question. Consequently, simulation will be the
performance evaluation technique for this class of receiver.

Because the structure of the integrated DFE which will be discussed in section 5.3 is
most like the Varanasi style uncoded link multistage decoder, it is interesting to compare the
integrated DFE with the Varanasi style SCS-DFE. In addition, because it was shown in
[20 - 21] that in most cases, the Hybrid DFE outperforms the other two architectures on an
uncoded link, it is an obvious candidate for use in an SCS-DFE structure. Thus, the
structures that were simulated were the Hybrid and Varanasi versions of the SCS-DFE. The
modifications to the decision making devices in each preliminary stage of the multistage
decoders discussed in [16] were not considered here, although those modifications may
provide improvements in some cases.

Figure  9  shows  the  performance  curves  for  a  four-user  "0.2  channel".  As  this  figure 
illustrates,  the  conventional  decoder  suffers  about  a  3  dB  loss  at  Pb  oarage  =  2-10"^  relative  to 
the performance of the same receiver operating in the absence of MUI. For this case, the
Hybrid  version  of  the  SCS-DFE  outperforms  the  Varanasi  version  of  the  SCS-DFE.  This  is 
similar  to  the  results  obtained  on  the  uncoded  link  simulated  in  [20  -  21]. 

Figure  10  shows  the  performance  for  a  more  severe  channel.  In  this  figure,  it  is  evident 
that all of the decoders perform significantly worse than in the "0.2 channel" due to the more
severe  MUI.  In  addition,  the  hybrid  SCS-DFE  again  outperforms  the  Varanasi  DFE. 

Once  again,  the  complexity  of  the  soft-decision  partitioned  DFE  receiver  will  be  of  the 
same order as in the hard-decision case.

5.  Combined Equalization and Decoding Approaches

In  all  of  the  partitioned  approaches,  regardless  of  the  type,  the  multiuser  receiver 
operates  at  the  code  symbol  level  as  though  there  were  no  coding  on  the  link,  and  then  passes 
its decisions or improved statistics to an outer decoder. The deficiency of this approach is
that separating the functions of cancelling the MUI and decoding the message does not take
full advantage of the coding on the link. The approaches discussed in this section attempt to
alleviate  this  shortcoming. 

5.1.  Trellis-Based  and  Tree-Based  Combined  Equalization  and  Decoding  Approaches 
The  MLSE  of  [25]  is  a  trellis-based  approach  which  combines  the  functions  of  equali¬ 
zation  and  decoding  into  one  operation.  This  approach  is  the  optimal  sequence  estimator. 
The MLSE has a prohibitively high number of states, so it is natural to try to simplify the
decoding process by combining states in some fashion. This approach
is what is referred to as reduced state sequence estimation (RSSE). There are a
number  of  publications  on  the  application  of  RSSE  to  simplify  the  MLSE  for  the  uncoded 
case,  [6],  [31],  as  well  as  at  least  one  other  on  the  application  of  RSSE  to  simplify  the  pro¬ 
cess  of  equalizing  and  decoding  a  coded  signal  on  a  single-user  dispersive  link,  [28].  It  is 
undoubtedly  possible  to  apply  these  techniques  to  the  problem  at  hand  to  obtain  a  perfor¬ 
mance  versus  complexity  tradeoff. 

Just as in [8], it is also presumably possible to use sequential decoding approaches to
lower the complexity of the MLSE for the coded link case. Sequential decoding would also
provide the opportunity to trade off performance against complexity.

One problem with the trellis-based approaches, however, is that they may not be as
robust to mismatch (a misestimation of the correlation or energy parameters) as some of the
simpler approaches like the partitioned decorrelator and DFE approaches. In [15], it was
shown that the uncoded link MLSE was not as robust as a Varanasi DFE to mismatch, in the
sense that for even small values of mismatch, the suboptimum DFE outperformed the MLSE.
This  high  sensitivity  of  the  optimal  approach  to  mismatch  is  a  very  undesirable  feature,  and 
it  may  very  well  carry  over  to  suboptimal  approaches  which  are  based  on  the  MLSE  like  the 
RSSE  and  sequential  approaches.  Nonetheless,  given  no  mismatch,  the  trellis-based  and 
tree-based  approaches  have  the  potential  to  perform  nearly  as  well  as  the  MLSE,  possibly 
with a significantly decreased complexity.

5.2.  The Decision Feedback Combined Equalization and Decoding Approach: The
Integrated  DFE 

The  idea  in  the  approach  we  will  refer  to  as  the  integrated  DFE  is  that  the  MUI  can  be 
more  reliably  estimated  by  exploiting  the  coding.  Thus,  instead  of  using  a  hard-decision  dev¬ 
ice  in  the  first  stage  of  a  multistage  DFE,  like  that  of  [5],  we  may  decode  the  message  using  a 
soft-decision  Viterbi  algorithm  which  operates  on  the  stream  of  matched  filter  outputs  in  the 
channel,  and  then  re-encodes  the  decoded  bits  to  form  estimates  of  the  code  bits.  These 
code bit estimates can then be used to estimate the MUI in other users' channels. Again, as
in the uncoded case, the decision feedback may be performed for as many stages as is
desired (see Figure 8). The performance of this approach is difficult to evaluate analytically,
again  due  to  the  presence  of  incorrect  feedback.  As  a  result,  simulation  will  be  the  perfor¬ 
mance  evaluation  technique  in  this  section. 
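
The feedback step described above can be sketched in a few lines. This is only an illustrative simplification, not the receiver of this paper: it assumes a synchronous, single-symbol model, and the names `cancel_mui`, `multistage`, `hard_decision` and the `rho[k][j]` layout are hypothetical. In the integrated DFE the decision device would be a soft-decision Viterbi decoder followed by a re-encoder; a simple sign detector stands in for it here.

```python
import math

def cancel_mui(y, bit_estimates, energies, rho):
    """Subtract the estimated MUI from each user's matched-filter output.

    y[k]             : matched-filter output for user k
    bit_estimates[j] : tentative +/-1 code-bit estimate for user j
    energies[j]      : received code-bit energy of user j
    rho[k][j]        : cross-correlation between users k and j (assumed known)
    """
    K = len(y)
    cleaned = []
    for k in range(K):
        mui = sum(rho[k][j] * math.sqrt(energies[j]) * bit_estimates[j]
                  for j in range(K) if j != k)
        cleaned.append(y[k] - mui)
    return cleaned

def hard_decision(stats):
    """Stand-in decision device: the sign of each statistic."""
    return [1 if s >= 0 else -1 for s in stats]

def multistage(y, energies, rho, decide, stages):
    """Run `stages` decision stages; every stage after the first sees MUI-cancelled statistics."""
    estimates = decide(y)
    for _ in range(stages - 1):
        estimates = decide(cancel_mui(y, estimates, energies, rho))
    return estimates
```

Replacing `hard_decision` with a decode-and-re-encode routine would give the structure described in the text; a one-stage receiver reduces to the decision device acting on the raw matched-filter outputs.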

A  characteristic  of  convolutional  codes,  or  most  codes  for  that  matter,  is  that  at  very 
low  signal  to  noise  ratios,  the  coded  link  may  perform  worse  than  an  uncoded  link.  When 
the  signal  to  noise  ratio  is  in  this  regime,  it  is  possible  that  the  Viterbi  decoder  whose  outputs 
are  re-encoded  to  form  the  ML  estimate  of  the  code  bit  sequence,  may  perform  worse  than  a 
simple hard-decision device operating on the code bits without regard to the coding. As a
result, for the integrated DFE to outperform an SCS-DFE, the combination of the thermal noise
and MUI must not be so strong that the re-encoded Viterbi output sequence is worse than the
code bits estimated by a simple threshold detector. Basically,
the structure which provides better estimates of the code bit sequence will have a
better  estimate  of  the  MUI  in  the  other  channel’s  multistage  decoders.  In  general,  because 
coding  generally  allows  better  estimates  of  the  transmitted  sequence,  it  is  reasonable  to 
expect  the  integrated  DFE  to  outperform  an  SCS-DFE  of  a  similar  architecture  like  a  SCS- 
DFE  with  a  Varanasi  DFE. 

Figure 9 shows the performance of the integrated DFE and the various SCS-DFE
approaches on the "0.2 channel". In this environment, the integrated DFE is able to recoup
nearly all of the roughly 3 dB loss suffered by the conventional decoder, while the various
SCS-DFEs are able to recoup only some of it. It may thus be concluded that the thermal noise
and MUI are weak enough for the system to be operating in the regime where a receiver which
exploits the coding performs better than one which does not.

Figure 10 shows the performance on a more severe channel. The integrated DFE still
uniformly  outperforms  the  Varanasi  version  of  the  SCS-DFE.  For  these  particular  channel 
characteristics,  however,  the  hybrid  SCS-DFE  is  able  to  outperform  the  integrated  DFE  at 
larger values of $E_b/N_0$. While this may seem surprising at first, it is simply due to the fact
that  even  though  the  hybrid  SCS-DFE  performs  separate  equalization  and  decoding,  it  has  a 
high  quality  first  stage  which  is  able  to  provide  better  code  symbol  estimates  to  the  second 
stage  MUI  estimator  than  the  conventional  Viterbi  algorithm  operating  in  the  first  stage  of 
the  integrated  DFE.  This  case  illustrates  that  when  the  MUI  is  strong  enough,  the  integrated 
DFE will not always outperform a well designed SCS-DFE, although it does in most cases.

To compute the TCB for the integrated DFE, we again assume that the computation of
the MUI in each stage of the DFE structures is roughly equivalent in complexity to the
computation of one metric in the Viterbi decoder. Adopting this convention, we may
conclude that for a link with rate $1/Q$ and constraint length $W$ codes, the $J$-stage integrated DFE
has a time complexity of roughly $TCB = O((J-1)Q + J\,2^{W})$. This is significantly higher than
that of the SCS-DFE, although it is far less than that of the MLSE of [25]. In the general rate
$P/Q$ code case, if again $\kappa = \log_2 S$ where $S$ is the number of states in each user's encoder, the
integrated DFE has $TCB = O([(J-1)Q + J\,2^{\kappa+P}]/P)$. Again, because each MUI computation
grows in complexity with $K$, we may say that the number of arithmetic operations per
decoded bit, $OP_{IDFE}$, is of about the same order as for the partitioned SCS-DFE approaches.
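
As a rough numerical illustration of this counting convention, the sketch below tallies metric computations per decoded bit. It rests on two assumptions that are a reading of the text rather than something stated outright: a trellis with $S = 2^{\kappa}$ states and $P$ input bits per stage has $2^{\kappa+P}$ branch metrics per stage, and each MUI estimate for one code bit costs the same as one metric. The function names are hypothetical.

```python
def tcb_conventional(kappa, P):
    """Branch metrics per decoded bit for one Viterbi decoder (S = 2**kappa states)."""
    return 2 ** (kappa + P) / P

def tcb_integrated_dfe(J, kappa, P, Q):
    """J Viterbi decoders plus (J - 1) MUI estimates for each of the Q code bits,
    all counted per trellis stage and divided by the P bits decided per stage."""
    return ((J - 1) * Q + J * 2 ** (kappa + P)) / P

# Rate-1/2, 4-state code (kappa = 2, P = 1, Q = 2), as used in the simulations.
print(tcb_conventional(kappa=2, P=1))
print(tcb_integrated_dfe(J=3, kappa=2, P=1, Q=2))
```

Under these assumptions a three-stage integrated DFE costs a small multiple of the conventional decoder per user, which is the sense in which its TCB is higher than that of the SCS-DFE yet far below that of the MLSE.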

We  may  thus  conclude  that  the  integrated  DFE  has  a  larger  TCB  than  any  of  the  SCS- 
DFE  approaches,  but  its  complexity  as  measured  in  terms  of  the  number  of  operations 
required  per  decoded  bit  is  not  necessarily  higher  and  its  performance  is  better  in  most  cases. 
As  a  result,  the  integrated  DFE  is  an  attractive  approach.  As  we  have  seen  in  Figure  10, 
however,  even  the  integrated  DFE  does  not  perform  well  when  the  MUI  becomes  too  strong. 

6.  Conclusions 

In this paper, a large number of approaches have been discussed (see Figure 1). The
approaches can be categorized as linear, DFE and trellis-based, as in Figure 1, or they may
be  categorized  as  partitioned  and  combined  approaches  as  they  were  presented  in  this  paper. 

Throughout this paper, the number of arithmetic operations per decoded bit was used as
a measure of the complexity of the receivers. It is clear that for a large number of users and
large codes, the MLSE and Partitioned MLSE are too complex to be of use. The conventional
approach has the lowest complexity and all of the other approaches have an intermediate
complexity (linear with K).

Throughout  this  paper,  the  various  approaches  were  compared  using  the  AMCG  perfor¬ 
mance  measure  whenever  possible.  When  the  receivers  did  not  lend  themselves  to  an  AMCG 
analysis,  as  in  the  case  of  the  DFE’s,  computer  simulations  were  used  to  compare  their  per¬ 
formance  to  the  important  baselines,  the  single-user  bound,  the  MLSE’s  performance  and  the 
conventional  receiver’s  performance. 

An examination of the figures of this paper illustrates that the MLSE of [25] has the best
performance. The worst performance of any receiver considered was the hard-decision
conventional followed by the soft-decision conventional. The soft-decision partitioned
approaches with a decorrelator or DFE inner receiver provide reasonable performance and
have a fairly low complexity. Also, the integrated DFE provided a good compromise of
performance and complexity in the situations considered.

[Figure 1 tree diagram. Root: Demodulators for CDMA systems with convolutional codes. Branches: With Decoding Only (no Equalization), 'Conventional'; With Equalization And Decoding, subdivided into Linear Equalization, Decision Feedback, and Trellis Based.]

Figure 1: Tree diagram of the possible receiver structures for CDMA systems operating with convolutional codes.
All of the partitioned approaches (separate equalization and decoding) can be implemented in a hard or
soft decision form. (The approaches in boxes will be discussed in this paper.)


[Figure 2 diagram label: Parallel Binary Symmetric Channels (Hard-Decision Case)]

Figure 2: CDMA link with a partitioned receiver. If the interleaving can successfully
break up the inner channel's memory, then the link within the dotted box
may be accurately modeled as K parallel BSCs in the hard-decision
multiuser receiver case.

(n  s:  n  if  there  is  no  interleaving  on  the  link) 


[Figure 3 plot. Panel title: Partitioned Hard-Decision MLSE w/ Perfect Interleaving; axis label: AMCG]

Figure 3: Plot of AMCG for the 2-user, $\rho_{21}(0) = \rho_{21}(1) = 0.2$ case where both users employ a rate 1/2
4-state code with $d_f = 5$. The ACG for a single-user system using this code is 10 log (1.5) dB for a
hard-decision decoder.

Figure 4: Plot of AMCG for the 2-user, $\rho_{21}(0) = \rho_{21}(1) = 0.3$ case where both users employ a rate 1/2
4-state code with $d_f = 5$. The ACG for a single-user system using this code is 10 log (1.5) dB for a
hard-decision decoder.


Figure 5: Plot of AMCG for the 2-user, $\rho_{21}(0) = \rho_{21}(1) = 0.3$ case where both users employ a rate 1/2
64-state code with $d_f = 10$. The ACG for a single-user system using this code is 10 log (2.5) dB for a
hard-decision decoder.


Figure 6: Plot of AMCG for the 2-user, $\rho_{21}(0) = \rho_{21}(1) = 0.2$ case where both users employ a rate 1/2
4-state code with $d_f = 5$. The ACG for a single-user system using this code is 10 log (1.5) dB for a
hard-decision decoder and 10 log (2.5) dB for a soft-decision decoder. Lower bounds are shown as solid lines,
upper bounds as dashed lines, and the partitioned hard-decision approaches are shown as dotted lines for
comparison (from Figure 3).


Figure 7: Plot of AMCG for the 2-user, $\rho_{21}(0) = \rho_{21}(1) = 0.3$ case where both users employ a rate 1/2
4-state code with $d_f = 5$. The ACG for a single-user system using this code is 10 log (1.5) dB for a
hard-decision decoder and 10 log (2.5) dB for a soft-decision decoder. Lower bounds are shown as solid lines,
upper bounds as dashed lines, and the partitioned hard-decision approaches are shown as dotted lines for
comparison (from Figure 4).
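
The single-user ACG values quoted in these captions can be reproduced with the usual high-SNR expressions for a rate-$R$ code with free distance $d_f$: $R\,d_f$ for soft decisions and $R\,\lceil d_f/2 \rceil$ for hard decisions. These closed forms are standard textbook results and are not taken from this paper; the function names below are hypothetical.

```python
import math

def acg_soft(rate, d_free):
    """Asymptotic coding gain (linear scale) with soft-decision decoding."""
    return rate * d_free

def acg_hard(rate, d_free):
    """Asymptotic coding gain (linear scale) with hard-decision decoding."""
    return rate * ((d_free + 1) // 2)  # integer form of ceil(d_free / 2)

def to_db(x):
    return 10.0 * math.log10(x)

for states, d_free in ((4, 5), (64, 10)):
    print(states, "states:",
          "hard", round(to_db(acg_hard(0.5, d_free)), 2), "dB,",
          "soft", round(to_db(acg_soft(0.5, d_free)), 2), "dB")
```

For the rate-1/2 codes above this gives linear gains of 1.5 (hard) and 2.5 (soft) for $d_f = 5$, and 2.5 (hard) for $d_f = 10$, matching the captions.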


Figure 8: The structure of a 3-stage integrated DFE for the $k$-th user with Viterbi algorithms (denoted VA)
at each stage. $\Delta$ is the maximum delay in code bit periods corresponding to $\delta$ information bit periods.


Figure 9: Performance curves of the various decision feedback receivers for a 4-user channel with
$\rho_{jk}(l) = 0.2$ for all overlapping bits. The solid lines show a single user system (no MUI) with and without
the rate-1/2 4-state convolutional code and a multiuser conventional receiver. Also shown are the one,
two and three stage soft code symbol DFEs for both the Varanasi (dashed) and Hybrid (dotted) architectures,
and a one, two and three stage integrated DFE (dashed lines). Note that the Varanasi style one-stage
soft code symbol DFE and the one-stage integrated DFE are both equivalent to the conventional
receiver.

Figure 10: Performance curves of the various decision feedback receivers for a more severe 4-user channel.
(See correlation matrices below.) The solid lines show a single user system (no MUI) with and
without the rate-1/2 4-state convolutional code and a multiuser conventional receiver. Also shown are the
one, two and three stage soft code symbol DFEs of both the Varanasi (dashed) and Hybrid (dotted) architectures,
and a one, two and three stage integrated DFE (dashed lines). Note that the Varanasi style one-stage
soft code symbol DFE and the one-stage integrated DFE are both equivalent to the conventional
receiver.

Abstract


The  optimal  multiuser  sequence  estimator  is  formulated  for  an  asynchronous 
direct-sequence CDMA system where each user employs convolutional coding to
improve  its  performance  on  a  non-dispersive  AWGN  channel.  It  is  shown  that  the 
decoder  may  be  implemented  efficiently  using  a  Viterbi  algorithm  which  operates  on  a 
time-varying  trellis  with  a  number  of  states  which  is  exponential  in  the  product  of  the 
number  of  users  in  the  system  and  the  constraint  length  of  the  codes  used  (for  the  rate  1/2 
code  case).  The  asymptotic  efficiency  of  this  receiver  relative  to  an  uncoded  coherent 
BPSK  receiver  (termed  asymptotic  multiuser  coding  gain,  or  AMCG)  is  then  upper  and 
lower  bounded.  The  AMCG  parameter  unifies  the  asymptotic  coding  gain  parameter  and 
the asymptotic multiuser efficiency parameter which are traditional figure of merit
parameters  for  single-user  coded  systems  and  multiuser  uncoded  systems  respectively. 
Finally,  some  simulations  are  presented  to  illustrate  the  performance  of  the  MLSE  at 
moderate  and  low  bit  error  rates. 


1.  Introduction 


In  code  division  multiple  access  (CDMA)  systems,  multiple  users  transmit,  often  asyn¬ 
chronously,  over  a  common  communication  channel,  typically  using  the  direct  sequence 
spread  spectrum  technique.  The  receiver  operating  in  the  AWGN  environment  receives  a 
signal  which  is  the  sum  of  all  of  the  transmitted  signals  in  noise,  and  the  receiver  must  syn¬ 
chronize to the desired signal and estimate the desired user's transmitted data. Often, in an
attempt to improve performance, error control coding will be used on each of the links as
well.

The  traditional  method  of  coherently  demodulating  direct  sequence  CDMA  signals  is  to 
synchronize  a  local  code  generator  and  oscillator  to  the  signal  of  interest  and  then  to  make 
decisions on the received signal as though the desired signal is the only one present. The
traditional decoder's structure is that of a correlator or matched filter which is matched to the
desired signal, followed by a decoder if coding is used on the link. The performance of the
traditional  decoder  suffers  for  two  major  reasons.  First,  the  signature  sequences  of  the  dif¬ 
ferent  users  will  not  be  orthogonal  to  each  other,  giving  rise  to  multi-user  interference,  or 
MUI,  and  second,  in  the  common  situation  where  all  of  the  signals  arriving  at  the  receiver 
are  of  different  strengths  the  strong  signals  tend  to  overwhelm  the  weak  signals,  even  with 
reasonably good signature sequences. This second problem is referred to as the near-far
problem. 

There  has  been  a  large  amount  of  interest  recently  in  the  design  of  multiuser  receivers 
for  CDMA  systems.  These  receivers  jointly  estimate  the  transmitted  symbols  of  all  of  the 
users  in  the  system,  as  opposed  to  estimating  them  independently.  This  approach  is  most 
appropriate  for  a  base  station  in  a  multipoint-to-point  network  where  the  receiver  must 
acquire  and  demodulate  all  of  the  signals  in  the  network.  Almost  all  of  the  multiuser  detec¬ 
tion  work  has  centered  on  uncoded  links,  see  for  example  [1]  -  [5].  Only  recently  has  the 
problem of multiuser detection of coded links been considered, [6], [11] - [13] and [15].

In this paper, the multiuser maximum likelihood receiver will be formulated for
convolutionally coded nondispersive AWGN links. We will see that this receiver performs both
the functions of equalization of the MUI and decoding of the code together. Section 2 of this
paper will outline some of the notation that will be used, and will also define the problem in
more precise terms. In section 3, we formulate the ML receiver using the rate 1/2 code case
as an example, and then analyze its performance.

2.  Notation 

It will be assumed that the CDMA system has $K$ users operating simultaneously on a
common frequency in an asynchronous fashion. In general, it will be assumed that each user
employs binary convolutional coding on its link. While it is quite conceivable that block
codes could be used effectively on a CDMA link, convolutional codes have the advantage
that they operate in a sequential fashion. Because the decoders that will be studied in this
work are sequential in nature, the convolutional codes are a much better match to the
decoders than block codes. One further assumption in this paper is that each user employs
the same convolutional code, although it is not at all difficult to generalize this work to the
case where each user employs a different code.

At each time interval of length $T_s$, the convolutional code is generated for user $k$ by
passing $P$ binary information bits, $I_k(n) = (I_k^{(1)}(n), \ldots, I_k^{(P)}(n))$, through a shift register
consisting of $W$ stages with $Q$ modulo-2 adders. The number of output bits for each $P$-bit input
is $Q$ bits. The rate of the code is $R_c = P/Q$ and the constraint length of the code is
$W$. The output sequence of binary code bits for the interval corresponding to input bits $I_k(n)$
is $(D_k^{(1)}(n), \ldots, D_k^{(Q)}(n))$. Note that for $W = 1$ and $P = Q = 1$, we have the uncoded case, so in
that case $D_k(n) = I_k(n)$.
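
The encoder just described can be sketched for the $P = 1$ case as follows. The generator taps (octal 7 and 5, a common rate-1/2 4-state choice with free distance 5) are an assumption for illustration only; the text specifies the rate, the number of states and the free distance but not the taps. The function name `conv_encode` is hypothetical.

```python
def conv_encode(info_bits, generators=(0b111, 0b101), W=3):
    """Rate-1/Q feedforward convolutional encoder (P = 1).

    Each generator is a W-bit tap mask: the most significant bit taps the
    current input, the least significant bit taps the oldest stored input.
    Returns one Q-tuple of code bits (0/1) per information bit.
    """
    state = 0  # the W - 1 most recent past inputs
    out = []
    for bit in info_bits:
        register = (bit << (W - 1)) | state  # current input followed by the past inputs
        out.append(tuple(bin(register & g).count("1") % 2 for g in generators))
        state = register >> 1                # shift: the oldest input drops out
    return out

print(conv_encode([1, 0, 0]))                          # impulse response of the assumed code
print(conv_encode([1, 0, 1], generators=(1,), W=1))    # W = 1, Q = 1: uncoded, output equals input
```

The impulse response of the assumed code has Hamming weight 5, consistent with a free distance of 5, and the $W = 1$, $P = Q = 1$ call reproduces the uncoded case noted in the text.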

In the time interval $[nT_s + (q-1)T + t_k,\; nT_s + qT + t_k]$, user $k$ transmits data bit $D_k^{(q)}(n)$,
where $t_k$ represents the time shift of the $k$-th user relative to some reference time, thus
accounting for the asynchronism of the users relative to each other. $T$ represents the code bit
period and $T_b = T/R_c$ is the information bit duration, thus $T_s = QT = PT_b$. Let $t_k = m_k T + \tau_k$,
with $\tau_k \in [0, T)$ and $m_k$ an integer. Thus $m_k T$ is a coarse time shift and $\tau_k$ is a fine time shift
for user $k$.

Each user in the system is assigned a particular signature sequence, and it will be
assumed that this signature sequence has a duration equal to the code bit interval, although
this assumption can be relaxed with a change of the notation. We will combine the carrier
and signature sequence into a single signal, thus the $k$-th carrier multiplied by the binary ($\pm 1$)
signature sequence, $PN_k(t)$, will be denoted by

$$S_k(t) = \begin{cases} \sqrt{2/T}\; PN_k(t) \cos(\omega_c t), & 0 \le t \le T \\ 0, & \text{otherwise} \end{cases} \tag{1}$$


The energy of the $k$-th user's code bit measured at the receiver will be denoted by $E_k$. It will
be assumed that all $K$ users transmit their signals through a common additive white Gaussian
noise channel with two-sided noise spectral density $N_0/2$ W/Hz, and so the received signal
will have the following form

$$r(t) = \sum_{n=-\infty}^{\infty} \sum_{k=1}^{K} \sum_{q=1}^{Q} D_k^{(q)}(n) \sqrt{E_k}\; S_k(t - nT_s - (q-1)T - t_k) + z(t) \tag{2}$$

where $z(t)$ denotes the noise.


Next we define the partial cross-correlation of the known signature sequences $j$ and $k$ to
be:

$$\rho_{jk}(l) = \int_{-\infty}^{\infty} S_j(t - \tau_j)\, S_k(t - lT - \tau_k)\, dt . \tag{3}$$

It is worth noting that $\rho_{jj}(0) = 1$ and $\rho_{jk}(l) = \rho_{kj}(-l)$.
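
A discrete-time analogue of this partial cross-correlation can make the two properties above concrete. The sketch assumes baseband rectangular chips, no carrier, and fine delays that are whole numbers of chips; the names `chip` and `partial_xcorr` and the example sequences are hypothetical.

```python
def chip(seq, n):
    """Chip n of a signature sequence, zero outside one code-bit interval."""
    return seq[n] if 0 <= n < len(seq) else 0

def partial_xcorr(seq_j, seq_k, d_j, d_k, l):
    """Discrete analogue of rho_jk(l): correlate signature j delayed by d_j chips
    with signature k delayed by l code-bit intervals plus d_k chips."""
    N = len(seq_j)
    start = min(d_j, l * N + d_k)
    stop = max(d_j, l * N + d_k) + N
    total = sum(chip(seq_j, n - d_j) * chip(seq_k, n - l * N - d_k)
                for n in range(start, stop))
    return total / N

sig_a = [1, 1, 1, -1, -1, 1, -1]
sig_b = [1, -1, 1, 1, -1, -1, 1]
for l in (-1, 0, 1):
    print(l, partial_xcorr(sig_a, sig_b, 2, 5, l))
```

With delays restricted to less than one code-bit interval, the result is zero for $|l| \ge 2$, the autocorrelation at zero lag is one, and swapping the users while negating $l$ gives the same value.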

The base station that will be referred to as the conventional base station on this coded
link attempts to estimate the $k$-th user's data using only the matched filter outputs for the $k$-th
user. It will be assumed that the Viterbi algorithm operating on each user's observed code
symbols is a soft-Viterbi algorithm having a decoding delay of $\delta$ information symbols, where
generally $\delta$ will be several times the constraint length, $W$, of the code. The time complexity
per decoded bit for this receiver may be estimated by considering the number of metric
computations per information bit decided. If we define the binary memory order of the encoder
to be $\kappa = \log_2 S$ where $S$ is the number of states of each user's encoder, then there are
$2^{\kappa+P}$ metrics computed for every $P$ bits decided, so $TCB = O(2^{\kappa+P}/P)$.

As in the uncoded case, there are a number of ways that a multiuser receiver can
operate to improve upon the performance of the conventional basestation. In the next section,
the optimum maximum likelihood sequence estimator will be derived for this problem
and analyzed. Because this receiver has a very high complexity, in [11] and [13], a partitioned
trellis-based approach is introduced, along with a number of multistage decision
feedback approaches which all have a lower complexity than the optimal sequence estimator.

3.  Optimum Sequence Estimator For Rate-1/2 Convolutional Codes

The optimal MLSE will now be derived for the special case in which each user in the
network is employing a rate-1/2 convolutional code with a constraint length of $W$, so
$T_b = 2T$. Our limitation to this special case will facilitate considerably the derivation of
the decoder, and it will then be outlined how the optimal decoder can be derived in a similar
way for a general rate-$P/Q$ convolutional code case.

To begin, it is important to note that the optimal sequence estimator or equalizer for
multiple-user uncoded signals operates in a "round-robin" fashion among all $K$ users in the
system, [1]. This Viterbi algorithm traverses one trellis stage per channel bit observed. The
optimal sequence estimator for decoding the rate-1/2 code for one of the users in a single
user environment, however, is a Viterbi algorithm which requires two channel observations
from the user of interest to move ahead one stage in the trellis, [14]. The rate-1/2 convolutional
code can, however, be viewed not as a code which produces two binary bits per information
bit period, $T_b$, but as an equivalent trellis code which produces one 4-ary coded
waveform every $T_b$ seconds. By formulating the equalization problem at the receiver with
respect to this super-code-symbol view of the received signal, we can accomplish both the
tasks of equalization and decoding in the same Viterbi algorithm. Because there is only one
4-ary super-symbol received for each information bit that must be decided, the decoder can
be formulated in basically the same fashion as was used in [1] for the MUI problem or [7] for
the ISI problem.

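We begin by defining the following notation.

$$\delta_1(t) = \begin{cases} 1, & t \in [0, T) \\ 0, & \text{otherwise} \end{cases} \tag{4}$$

$$\delta_2(t) = \begin{cases} 1, & t \in [T, 2T) \\ 0, & \text{otherwise} \end{cases} \tag{5}$$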
Next, define a concatenation of coded signals as

$$\tilde{D}_k(t - nT_b) = D_k^{(1)}(n)\, \delta_1(t - nT_b) + D_k^{(2)}(n)\, \delta_2(t - nT_b) \tag{6}$$

Two of the signature-carrier waveforms can likewise be concatenated to form a
super-signature waveform.

$$\tilde{S}_k(t - nT_b - t_k) = S_k(t - nT_b - t_k) + S_k(t - (n + \tfrac{1}{2})T_b - t_k) \tag{7}$$

This presumes that the signature sequence repeats every code symbol period. The received
waveform may now be written in terms of these waveforms of duration $T_b$:

$$r(t) = \sum_{n=-\infty}^{\infty} \sum_{k=1}^{K} \tilde{D}_k(t - nT_b - t_k)\, \tilde{S}_k(t - nT_b - t_k) \sqrt{E_k} + z(t) \tag{8}$$

This signal may be viewed as a four-valued super-code symbol modulating
a pair of orthonormal basis functions through the procedure defined above. The basis
functions in this new view of the waveform are

$$\phi_{1k}(t) = \delta_1(t)\, \tilde{S}_k(t) \tag{9}$$

and

$$\phi_{2k}(t) = \delta_2(t)\, \tilde{S}_k(t) \tag{10}$$

appropriately synchronized with the information bit periods. Thus, this equivalent view of
the coding process suggests that the information bits are mapped by the encoder onto
waveforms in a space defined by $\phi_{1k}(t)$ and $\phi_{2k}(t)$. Note that although the bases defined in
(9) and (10) are orthonormal, they are not, in general, orthogonal to $\phi_{1j}(t)$ and $\phi_{2j}(t)$, which
are the basis set for another user in the system, user $j$, since $S_k(t)$ and $S_j(t)$ are not orthogonal
in general. The result when the received signal is a sum of $K$ component signals is MUI. We
now define four parameters which are a measure of the degree of correlation between the
basis functions of the different users.


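$$U_{jk}(l) = \int_{-\infty}^{\infty} \phi_{1j}(t - t_j)\, \phi_{1k}(t - lT_b - t_k)\, dt \tag{11}$$

$$V_{jk}(l) = \int_{-\infty}^{\infty} \phi_{2j}(t - t_j)\, \phi_{2k}(t - lT_b - t_k)\, dt \tag{12}$$

$$W_{jk}(l) = \int_{-\infty}^{\infty} \phi_{1j}(t - t_j)\, \phi_{2k}(t - lT_b - t_k)\, dt \tag{13}$$

$$X_{jk}(l) = \int_{-\infty}^{\infty} \phi_{2j}(t - t_j)\, \phi_{1k}(t - lT_b - t_k)\, dt \tag{14}$$

These parameters play the same role in the super-symbol view of the coded signal that $\rho_{jk}(l)$
plays for the standard view of the signal. In fact, $U$, $V$, $W$ and $X$ can be related to $\rho$ directly
by substituting (9) and (10) into (11) through (14).

$$U_{jk}(l) = \rho_{jk}(2l + m_k - m_j) \tag{15a}$$

$$V_{jk}(l) = \rho_{jk}(2l + m_k - m_j) \tag{15b}$$

$$W_{jk}(l) = \rho_{jk}(2l + 1 + m_k - m_j) \tag{15c}$$

$$X_{jk}(l) = \rho_{jk}(2l - 1 + m_k - m_j) \tag{15d}$$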

Note that $U_{jk}(l) = V_{jk}(l) = W_{jk}(l) = X_{jk}(l) = 0$ for $|l| > 1$; this fact will play an important role
in determining the proper state description of the system for the optimal sequence estimator.
Some other useful properties of the correlation parameters are $U_{jk}(l) = U_{kj}(-l)$,
$V_{jk}(l) = V_{kj}(-l)$ and $X_{jk}(l) = W_{kj}(-l)$.
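
The relations (15a) through (15d) and the symmetry properties just listed are easy to check numerically. The sketch below assumes a hypothetical two-user table of $\rho_{jk}(n)$ values that obeys $\rho_{jk}(n) = \rho_{kj}(-n)$; the numbers, the coarse shifts `m`, and the helper names are illustrative only.

```python
def make_params(rho, m):
    """Build U, V, W, X from rho(j, k, n) and the coarse shifts m[k], following (15a)-(15d)."""
    def U(j, k, l):
        return rho(j, k, 2 * l + m[k] - m[j])
    def V(j, k, l):
        return rho(j, k, 2 * l + m[k] - m[j])
    def W(j, k, l):
        return rho(j, k, 2 * l + 1 + m[k] - m[j])
    def X(j, k, l):
        return rho(j, k, 2 * l - 1 + m[k] - m[j])
    return U, V, W, X

# Hypothetical partial cross-correlations between users 1 and 2 (zero elsewhere).
_TABLE = {(1, 2, -1): 0.1, (1, 2, 0): 0.2, (1, 2, 1): 0.3}

def rho(j, k, n):
    if j == k:
        return 1.0 if n == 0 else 0.0
    if (j, k, n) in _TABLE:
        return _TABLE[(j, k, n)]
    return _TABLE.get((k, j, -n), 0.0)  # enforce rho_jk(n) = rho_kj(-n)

U, V, W, X = make_params(rho, m={1: 0, 2: 1})
for l in (-1, 0, 1):
    print(l, U(1, 2, l), W(1, 2, l), X(1, 2, l), W(2, 1, -l))
```

In each printed row the last two columns agree, which is the property $X_{jk}(l) = W_{kj}(-l)$.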

Beginning with equation (8), note that by performing a modulo-$K$ decomposition of the
index $i$, namely $i = \alpha(i)K + \beta(i) - 1$, and by assuming that the $K$ users transmit $(2M+1)/K$
information bits each in the time interval of interest and that the signal is zero outside of this
interval, we can write

$$r(t) = \sum_{i=-M}^{M} \tilde{D}_{\beta(i)}(t - \alpha(i)T_b - t_{\beta(i)})\, \tilde{S}_{\beta(i)}(t - \alpha(i)T_b - t_{\beta(i)}) \sqrt{E_{\beta(i)}} + z(t) \tag{16}$$
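
The index decomposition used here is straightforward to compute. The sketch below assumes user labels $\beta(i)$ run from 1 to $K$, which is what the "$-1$" in the decomposition suggests; the function name `decompose` is hypothetical.

```python
def decompose(i, K):
    """Return (alpha, beta) with i = alpha * K + beta - 1 and beta in 1..K."""
    alpha, remainder = divmod(i, K)  # floor division also handles negative i
    return alpha, remainder + 1

K = 3
for i in range(-4, 5):
    alpha, beta = decompose(i, K)
    print(i, "-> symbol interval", alpha, ", user", beta)
```

Stepping $i$ by one moves to the next user, and every $K$ steps the symbol interval $\alpha(i)$ advances by one, which is the round-robin ordering of the multiuser sequence.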


We now further simplify the notation by defining the following terms (equations (17) through (22)),
among them

$$V_{im} = V_{\beta(i)\beta(m)}(\alpha(m) - \alpha(i))$$

$$X_{im} = X_{\beta(i)\beta(m)}(\alpha(m) - \alpha(i))$$


We have now laid the foundation for the derivation of the MLSE. This development
will closely follow the derivation of the optimal MLSE in [7] and [8] for the uncoded ISI
channel.

By expanding (16) with a Karhunen-Loeve expansion and letting the dimensionality of
the expansion grow, we obtain the following waveform metric:

$$\Lambda(D_M) = -\int_{-\infty}^{\infty} \Big( r(t) - \sum_{i=-M}^{M} \tilde{D}_{\beta(i)}(t - \alpha(i)T_b - t_{\beta(i)})\, \tilde{S}_{\beta(i)}(t - \alpha(i)T_b - t_{\beta(i)}) \sqrt{E_{\beta(i)}} \Big)^2 dt \tag{23}$$

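We next define:

$$r_i^{(1)} = r_{\beta(i)}^{(1)}(\alpha(i)) = \int_{-\infty}^{\infty} r(t)\, \phi_{1\beta(i)}(t - \alpha(i)T_b - t_{\beta(i)})\, dt \tag{24}$$

and

$$r_i^{(2)} = r_{\beta(i)}^{(2)}(\alpha(i)) = \int_{-\infty}^{\infty} r(t)\, \phi_{2\beta(i)}(t - \alpha(i)T_b - t_{\beta(i)})\, dt \tag{25}$$

which represent the outputs of a pair of matched filters or correlators for the basis functions
for user $\beta(i)$ at time $t = (\alpha(i)+1)T_b + t_{\beta(i)}$. By expanding (23) and then collecting the
appropriate terms we get the following metric: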

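$$\Lambda(D_i) = \Lambda(D_{i-1}) + 2\Big[ D_i^{(1)} \sqrt{E_i}\, \Big( r_i^{(1)} - \sum_{l=1}^{L} \big( D_{i-l}^{(1)} U_{i,i-l} + D_{i-l}^{(2)} W_{i,i-l} \big) \sqrt{E_{i-l}} \Big) + D_i^{(2)} \sqrt{E_i}\, \Big( r_i^{(2)} - \sum_{l=1}^{L} \big( D_{i-l}^{(1)} X_{i,i-l} + D_{i-l}^{(2)} V_{i,i-l} \big) \sqrt{E_{i-l}} \Big) \Big] \tag{26}$$

where $D_i$ represents the multiuser code-symbol sequence up to time interval $i$, and $L$ is the
smallest integer such that for every $L' > L$ we have $U_{jk}(\alpha(L')) = V_{jk}(\alpha(L')) = W_{jk}(\alpha(L')) =
X_{jk}(\alpha(L')) = 0$. We have already seen that the correlation parameters are zero when
$|\alpha(L')| > 1$, so $L = K-1$.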
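
The recursive structure of the path metric can be sketched as a function that returns the increment contributed by symbol index $i$. This follows one reading of (26): each of the two matched filter outputs has the interference from the previous $L$ super-symbols subtracted before being weighted by the hypothesized code bit. The function name, the dictionary-based containers and the callables `U`, `V`, `W`, `X` (taking a pair of symbol indices) are hypothetical.

```python
import math

def stage_metric(i, D, r, E, U, V, W, X, L):
    """Metric increment for symbol index i.

    D[i] = (D1, D2) : hypothesized +/-1 code bits of super-symbol i
    r[i] = (r1, r2) : matched filter outputs for super-symbol i
    E[i]            : code-bit energy of the user that owns index i
    U(i, m) etc.    : correlation between super-symbols i and m
    """
    d1, d2 = D[i]
    r1, r2 = r[i]
    mui1 = 0.0
    mui2 = 0.0
    for l in range(1, L + 1):
        m = i - l
        p1, p2 = D[m]
        amp = math.sqrt(E[m])
        mui1 += (p1 * U(i, m) + p2 * W(i, m)) * amp
        mui2 += (p1 * X(i, m) + p2 * V(i, m)) * amp
    return 2.0 * math.sqrt(E[i]) * (d1 * (r1 - mui1) + d2 * (r2 - mui2))

# Tiny example: two indices, memory L = 1, no correlation between them.
zero = lambda i, m: 0.0
D = {0: (1, -1), 1: (1, 1)}
r = {1: (0.5, -0.25)}
E = {0: 1.0, 1: 4.0}
print(stage_metric(1, D, r, E, zero, zero, zero, zero, L=1))
```

A Viterbi search would add this increment along each branch, with the previous $L$ super-symbols supplied by the trellis state.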

There  are  a  number  of  important  observations  that  can  be  made  from  the  path  metric 
given  in  equation  (26).  First  of  all,  the  stage  metric  depends  only  on  the  code  symbols  in 
the  set 


$$S = \bigl\{D_i^{(1)}, D_i^{(2)}, D_{i-1}^{(1)}, D_{i-1}^{(2)}, \ldots, D_{i-L}^{(1)}, D_{i-L}^{(2)}\bigr\} \tag{27}$$

along with the matched filter outputs, $r_i^{(1)}$ and $r_i^{(2)}$, as well as the signal energies and correlations. It is possible to estimate the crosscorrelations using the local oscillators and code generators, which are assumed to be synchronized to the $K$ components of the incoming signal. We can also estimate the energies by averaging the outputs of the matched filters for a number of bits. (The number over which they would be averaged would depend on the rate at which the relative strengths of the users are varying.)

For any user, $k$, the convolutional encoder defines a mapping rule from the input information symbol and present state of the encoder to the code bits, $D_k^{(1)}(n)$ and $D_k^{(2)}(n)$. If we define $h(\cdot)$ as the mapping rule from the input information symbol and state to the 4-ary super-code symbol, then by substituting the information symbols that define the state of the encoder in for the state, the following expression may be written:

$$\bigl\{D_k^{(1)}(n),\, D_k^{(2)}(n)\bigr\} = h\bigl[I_k(n),\, I_k(n-1),\, \ldots,\, I_k(n-W+1)\bigr] \tag{28}$$

Thus, in this form, it is clear that the 4-ary super-code symbol depends on only $W$ information symbols. Using this information, it is easy to redefine the set $S$ which was defined in equation (27) in terms of the information symbols which affect the stage metric.

$$S = \{I_i, \sigma_i\} \tag{29}$$

$$\sigma_i = \{I_{i-1}, I_{i-2}, \ldots, I_{i-WK+1}\} \tag{30}$$

where $I_i = I_{\beta(i)}(\alpha(i))$. Thus it is now apparent that the system may be described in terms of $2^{WK-1}$ states, since the information symbols are binary. Furthermore, the maximum likelihood sequence estimator can be implemented with a Viterbi algorithm operating on a trellis with $2^{WK-1}$ states and two branches per state. This trellis will be cyclically time-varying as in the uncoded case [1]. Furthermore, it is clear that this trellis reduces to the trellis derived in [1] when the constraint length of the code is one (uncoded transmission for each user). Obviously, the number of states in the MLSE grows very quickly with both the number of users in the system and the constraint length of the codes being used. In fact, for a simple 4-user case where each user uses a $W = 3$, or 4-state, code, the MLSE requires a Viterbi algorithm operating on a trellis with $2^{11} = 2048$ states.

The time complexity per bit decoded for the multiuser MLSE is TCB $= O(2^{WK})$, since there are $2 \cdot 2^{WK-1}$ metrics which must be computed at each stage of the trellis and one information bit is decided at each stage. Note that for the case of $W = 1$, the TCB calculated in [1] is again obtained.
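The state and metric counts above are easy to tabulate. The following is a minimal sketch, assuming only the formulas stated in the text ($2^{WK-1}$ states, two branches per state); the function name is illustrative and not part of the original work.

```python
def rate_half_mlse_size(num_users: int, constraint_length: int) -> tuple[int, int]:
    """Return (states, metrics per stage) for the rate-1/2 multiuser MLSE trellis.

    Uses the counts quoted in the text: 2^(WK-1) states and two branches
    per state, i.e. 2 * 2^(WK-1) = 2^(WK) metrics per trellis stage.
    """
    states = 2 ** (constraint_length * num_users - 1)
    metrics_per_stage = 2 * states
    return states, metrics_per_stage


if __name__ == "__main__":
    # (K, W) pairs: the 2-user and 4-user 4-state-code cases, and uncoded (W = 1)
    for users, w in [(2, 3), (4, 3), (4, 1)]:
        states, metrics = rate_half_mlse_size(users, w)
        print(f"K={users}, W={w}: {states} states, {metrics} metrics per stage")
```

For $K = 4$ and $W = 3$ this gives the 2048 states quoted above, and for $K = 2$ and $W = 3$ it gives 32 states.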

Now that the MLSE has been derived for the rate-1/2 case, it is straightforward to generalize to the case of rate-$P/Q$ convolutional codes. The function $D_k(t)$ will again have to be constructed from a set of orthogonal basis functions. One reasonable choice would be a set of $Q$ non-overlapping pulses, each of duration $T$. Again, the function $D_k(t)$ would be constructed from concatenations of $Q$ versions of these pulses. The metric derivation could then proceed in the same fashion as in the rate-1/2 case. There are $2^P$ input hypotheses to test in each interval $T_b$ for each user, so the overall trellis will have $2^P$ branches per state. Furthermore, the state of the system will be specified by $(\kappa + P)(K-1) + \kappa$ information bits, so it will have $2^{(\kappa+P)K-P}$ states, where $\kappa = \log_2 S$ and $S$ is the number of states in the single user's encoder. This will result in a TCB $= O(2^{(\kappa+P)K}/P)$.
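As a rough check on the general rate-$P/Q$ counts, the sketch below evaluates the expressions quoted in the text. The function name and the returned dictionary keys are illustrative assumptions; only the formulas come from the paragraph above.

```python
def general_mlse_size(num_users: int, encoder_states: int, input_bits: int) -> dict:
    """Trellis size of the multiuser MLSE for rate-P/Q codes (formulas from the text).

    num_users      -- K, number of users
    encoder_states -- S, states in one user's encoder (assumed a power of two)
    input_bits     -- P, information bits entering each encoder per stage
    """
    kappa = encoder_states.bit_length() - 1          # log2(S) for a power of two
    if 2 ** kappa != encoder_states:
        raise ValueError("encoder_states must be a power of two")
    state_bits = (kappa + input_bits) * (num_users - 1) + kappa
    states = 2 ** state_bits                         # 2^((kappa+P)K - P)
    branches = 2 ** input_bits                       # 2^P branches per state
    metrics = states * branches                      # 2^((kappa+P)K) per stage
    return {
        "states": states,
        "branches_per_state": branches,
        "metrics_per_stage": metrics,
        "metrics_per_bit": metrics / input_bits,     # P bits decided per stage
    }


if __name__ == "__main__":
    print(general_mlse_size(num_users=4, encoder_states=4, input_bits=1))
    print(general_mlse_size(num_users=2, encoder_states=4, input_bits=2))
```

With $P = 1$ and $S = 4$ (so $\kappa = W - 1 = 2$) the result agrees with the rate-1/2 count of 2048 states for four users.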

Clearly  the  exponential  dependence  of  the  TCB  on  the  number  of  users,  the  number  of 
states  in  each  of  the  user’s  codes  and  P  makes  the  use  of  the  optimal  decoder  prohibitive  for 
a  realistic  system.  It  is,  however,  an  important  receiver  because  it  represents  the  best  that 
can  be  achieved  in  terms  of  sequence  error  probability,  and  it  will  provide  a  good  baseline  by 
which  to  judge  the  quality  of  suboptimal  schemes.  This  receiver  also  raises  the  possibility  of 
using  a  variety  of  sparse  searching  algorithms  like  a  sequential  decoder  as  was  used  in  [5]  for 
the  uncoded  case,  or  reduced  state  sequence  estimation  techniques  like  the  one  proposed  in 
[4]  for  the  uncoded  MUI  equalization  problem  or  [9]  for  the  combined  equalization  and 
decoding  problem  for  single-user  links  suffering  from  ISI. 

3.1 Performance of the Optimal Sequence Estimator

To illustrate the derivation of some performance bounds for the MLSE, we will again use the rate 1/2 code example. In this analysis we will fairly closely follow the analysis which appeared in [1] and [7]. In keeping with [1], we consider the decoding window to range from the index $-M$ to the index $M$. The goal of this section is to estimate the performance of the optimal sequence estimator by bounding the finite and infinite horizon error probabilities for the $k$-th user in the system, denoted $P_k^M(n)$ and $P_k = \lim_{M\to\infty} P_k^M(n)$.

Consider the transmission of the sequence of super-code symbols, $D$, and a competing sequence in the trellis, $D + 2e$, where $e$ is a sequence of code error symbols. Each element $e_k^{(q)}(n)$ can take on values in the set $G = \{0, \pm 1\}$. Next, define the following sets:


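$$B = \bigl\{\, e : n = \alpha(-M), \ldots, \alpha(M),\; k = 1, \ldots, K,\; q = 1, 2,\; e_k^{(q)}(n) \in G,\; e_k^{(q)}(n) \neq 0 \text{ for some } n, k, q \,\bigr\} \tag{31}$$

$$A(D) = \{\, e : e \in B,\; D + 2e \in C \,\} \tag{32}$$

$$C = \{\, D : D \in h(\{I\}) \,\} \tag{33}$$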

where $h(\cdot)$ is the mapping rule defined by the code from an information sequence, $I$, to a sequence of super-code symbols, $D$, as in (28). Since this mapping rule is a one-to-one function, it has an inverse. We define the information error sequence

$$\varepsilon = h^{-1}(D + 2e) - I \tag{34}$$

which is the information bit error sequence corresponding to $D + 2e$, so that if $D = h(I)$, then $D + 2e = h(I + \varepsilon)$. This allows us to define

$$A_k^M(D, n) = \{\, e : e \in A(D),\; \varepsilon_k(n) \neq 0 \,\} \tag{35}$$

so $A_k^M(D, n)$ is the set of admissible error sequences which affect the $n$-th information bit of the $k$-th user. From these definitions, it follows that the probability of error for the $n$-th bit of the $k$-th user is given by


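$$P_k^M(n) = \sum_{D \in C} P\Bigl[\, \bigcup_{e \in A_k^M(D,n)} \bigl\{\Lambda(D + 2e) > \Lambda(D) \mid D \text{ sent}\bigr\} \Bigr]\, P(D \text{ sent}) \tag{36}$$

As is the usual approach, we choose to bound (36) with a union bound.

$$P_k^M(n) \le \sum_{D \in C}\; \sum_{e \in A_k^M(D,n)} P\bigl(\Lambda(D + 2e) > \Lambda(D) \mid D \text{ sent}\bigr) \cdot P(D \text{ sent}) \tag{37}$$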


The event $\Lambda(D + 2e) > \Lambda(D)$ may now be written by expanding equation (23) and substituting

$$r_i^{(1)} = r_{\beta(i)}^{(1)}(\alpha(i)) = \sum_{j=i-K+1}^{i+K-1} \bigl(D_j^{(1)} U_{ij} + D_j^{(2)} W_{ij}\bigr)\sqrt{E_{\beta(j)}} + z_i^{(1)} \tag{38}$$

and

$$r_i^{(2)} = r_{\beta(i)}^{(2)}(\alpha(i)) = \sum_{j=i-K+1}^{i+K-1} \bigl(D_j^{(1)} X_{ij} + D_j^{(2)} V_{ij}\bigr)\sqrt{E_{\beta(j)}} + z_i^{(2)} \tag{39}$$

for $r_i^{(1)}$ and $r_i^{(2)}$ respectively, where $z_i^{(1)}$ and $z_i^{(2)}$ are the noise variates at the output of the matched filters for the basis functions $\phi_{1\beta(i)}$ and $\phi_{2\beta(i)}$ respectively for the interval $\alpha(i)$. After some algebra, the following expression for the event $\Lambda(D + 2e) > \Lambda(D)$ is obtained.

i=-Af  m=:-M 

<  2  Vi«(e!»4'>+«P4“)  (40) 


Let $\Delta^2(e)$ represent the left side of equation (40). The right side of equation (40) is a linear combination of the Gaussian random variables $z_i^{(1)}$ and $z_i^{(2)}$. It is not difficult to show that $E[z_i^{(1)}] = E[z_i^{(2)}] = 0$ and also that


E 


No 

2 


(41) 


As a result, if we define $y$ to be the right side of equation (40), then it is not difficult to show that $E[y] = 0$ and $\mathrm{Var}[y] = \Delta^2(e) \cdot N_0/2$.

Next, the two-sequence error probability, or the probability of the event given in equation (40), becomes the probability that the Gaussian random variable, $y$, is larger than the threshold, $\Delta^2(e)$. We next define the following efficiency parameter for the pair of sequences separated by the code symbol error sequence, $e$, as

$$\eta_k(e) = \frac{\Delta^2(e)}{E_{bk}} = \frac{\Delta^2(e)}{2E_k} \tag{42}$$

where $E_{bk} = 2E_k$ is the energy per information bit for user $k$. This allows us to write

$$P\bigl(\Lambda(D + 2e) > \Lambda(D) \mid D \text{ sent}\bigr) = Q\left(\sqrt{\frac{2E_{bk}}{N_0}\,\eta_k(e)}\right) \tag{43}$$

so $\eta_k(e)$ is the asymptotic efficiency relative to uncoded BPSK transmission for the $k$-th user for the pair of sequences $D$ and $D + 2e$. This can be shown to reduce to the form of the distance measure in [1] for the uncoded system, because as in [1], $\Delta^2(e)$ may also be expressed as the $L_2$ norm of the signal generated by modulating the error sequence.
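The two-sequence error probability in (43) is simple to evaluate numerically. The sketch below is illustrative only: the function names and the example values of $E_{bk}/N_0$ and $\eta_k(e)$ are assumptions, and the Gaussian tail function $Q(x)$ is computed from the standard identity $Q(x) = \tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def pairwise_error_probability(eb_over_n0_db: float, efficiency: float) -> float:
    """Evaluate Q(sqrt(2 * (Eb/N0) * eta)) for an efficiency eta >= 0."""
    eb_over_n0 = 10.0 ** (eb_over_n0_db / 10.0)      # dB to linear
    return q_function(math.sqrt(2.0 * eb_over_n0 * efficiency))


if __name__ == "__main__":
    # eta = 1 corresponds to uncoded BPSK in isolation; larger eta is a gain.
    for eta in (1.0, 2.5):
        print(eta, pairwise_error_probability(4.0, eta))
```

A larger efficiency gives a smaller pairwise error probability at the same $E_{bk}/N_0$, which is why the minimum efficiency governs the high-SNR behavior.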

In order to construct a lower bound on the probability of error for user $k$, we define the following minimum efficiency as

$$\eta_{k,\min}^M(n) = \inf_{D \in C}\; \inf_{e \in A_k^M(D,n)} \eta_k(e) \tag{44}$$

so that


4 


No 


(45) 


Thus  we  now  have  a  lower  bound  expression  for  P^(n)  given  in  (45),  and  when  (43)  is  sub¬ 
stituted  into  equation  (37)  we  have  an  upper  bound  on  P^(n). 

To obtain bounds for the infinite horizon error probabilities we may conclude exactly as in [1] that the infinite horizon efficiencies $\eta_k(e)$ and $\eta_{k,\min}$ are achieved by finite length error sequences. As a result, the infinite horizon error probability for the $k$-th user may be lower bounded by

$$P_k \ge P\bigl[\eta_k(e) = \eta_{k,\min}\bigr] \cdot Q\left(\sqrt{\frac{2E_{bk}}{N_0}\,\eta_{k,\min}}\right) \tag{46}$$

Similarly, by passing (37) to the limit as $M$ approaches infinity

$$P_k \le \sum_{D \in C}\; \sum_{e \in A_k(D,n)} P(D \text{ sent}) \cdot Q\left(\sqrt{\frac{2E_{bk}}{N_0}\,\eta_k(e)}\right) \tag{47}$$

where $A_k(D,n) = \lim_{M\to\infty} A_k^M(D,n)$. We should note that (47) may not converge for all noise levels. In [1], the convergence region was increased by limiting the inner sum to the set of indecomposable sequences. This solution perhaps would be of use here as well to obtain a tighter upper bound; however, we will not focus on this issue here because the convergence of (47) will not affect the rest of our analysis.

In the high signal-to-noise ratio regime, the terms in (47) with the minimum efficiency will dominate the asymptotic behavior of the receiver. As a result, we will refer to the minimum efficiency, $\eta_{k,\min}$, as the asymptotic multiuser coding gain for user $k$ (AMCG). The AMCG is an efficiency parameter which is a measure of the energy gain or loss of the receiver relative to an uncoded BPSK system operating in isolation with an energy per information bit of $E_{bk}$.

In the limiting cases where there is only $K = 1$ user in the system, or when there are $K$ users in the system with perfectly orthogonal super-signature sequences, $\eta_{k,\min}$ is the asymptotic coding gain (ACG) of a single-user system operating with the same code. In the limiting case where the users do not employ coding, $\eta_{k,\min}$ is equivalent to the asymptotic multiuser efficiency (AME) obtained in [1] for the optimal multiuser receiver for the uncoded system. Thus the asymptotic multiuser coding gain unifies the asymptotic coding gain and the asymptotic multiuser efficiency parameters.

The equation for $\eta_k(e)$ may be rewritten as a quadratic form,


To do this, we define the vector $e_\Gamma$ to be the subvector of the infinite length error sequence $e$ which consists of all of the nonzero components of $e$ and all zero components of $e$ which are surrounded by nonzero components. If we assume that the dimension of the vector $e_\Gamma$ is $2\Gamma \times 1$, and


then the matrix $H_\Gamma$ is defined as $H_\Gamma =$ where the submatrices are given by


(50)


Thus, $H_\Gamma$ has dimensions $2\Gamma \times 2\Gamma$. Also, $E_\Gamma$ is a diagonal energy matrix with diagonal elements $E_{ii} = (E_{\beta(i)})^{1/2}$.

As an example, consider the 2-user case where each user employs a rate 1/2, 4-state convolutional code, as is shown in Figure 2. If user 1 sends an all zeros sequence, and user 2 sends all zeros except for stage $i_0$, where a 1 is sent, then a valid error sequence is

$$e_6 = (-1\;\; -1\;\; 1\;\; 1\;\; 0\;\; -1\;\; 0\;\; 1\;\; -1\;\; -1\;\; 1\;\; 1)^T \tag{51}$$


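For this case, assuming that $m_1 = m_2 = 0$ so that $\chi_1 = \tau_1$ and $\chi_2 = \tau_2$, the $H_6$ matrix takes the form

$$H_6 = \left[\begin{array}{cccccccccccc}
1 & 0 & \rho_{21}(0) & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & \rho_{21}(1) & \rho_{21}(0) & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\rho_{21}(0) & \rho_{21}(1) & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \rho_{21}(0) & 0 & 1 & \rho_{21}(1) & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \rho_{21}(1) & 1 & 0 & \rho_{21}(0) & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & \rho_{21}(1) & \rho_{21}(0) & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \rho_{21}(0) & \rho_{21}(1) & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \rho_{21}(0) & 0 & 1 & \rho_{21}(1) & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \rho_{21}(1) & 1 & 0 & \rho_{21}(0) & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & \rho_{21}(1) & \rho_{21}(0) \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \rho_{21}(0) & \rho_{21}(1) & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \rho_{21}(0) & 0 & 1
\end{array}\right] \tag{53}$$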
and if the users have equal energy, then the effective efficiency for this error sequence is

$$\eta_1(e) = \tfrac{1}{2}\, e_6^T H_6\, e_6 = 5 - 5\rho_{21}(0) - 3\rho_{21}(1)$$

This implies that for this particular case, a sufficient condition for the MLSE to have an asymptotic loss relative to a single-user system is

$$\eta_1(e) = 5 - 5\rho_{21}(0) - 3\rho_{21}(1) < d_f/2 \tag{54}$$

implying that because the free distance of the code in use is $d_f = 5$, if

$$\rho_{21}(0) + \tfrac{3}{5}\,\rho_{21}(1) > \tfrac{1}{2} \tag{55}$$

then the MLSE will not achieve a single-user performance level as $N_0/2 \to 0$.
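In the same case, if the users' energies are not equal,

$$\eta_1(e) = \frac{5}{2} + \frac{5}{2}\,\frac{E_2}{E_1} - \bigl(5\rho_{21}(0) + 3\rho_{21}(1)\bigr)\sqrt{\frac{E_2}{E_1}} \tag{56}$$

This may be considered to be an upper bound on $\eta_{1,\min}$ since the minimum over all valid error sequences is no larger than the $\eta_1(e)$ for a particular valid error sequence.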
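The quadratic form for this example can be checked numerically. The sketch below rebuilds the $12 \times 12$ matrix of (53) and the error sequence of (51), and evaluates $\eta_1(e) = \frac{1}{2E_1} e^T E H E\, e$. Two points are assumptions made for illustration rather than statements from the source: that the entries of the error vector alternate in pairs between user 1 and user 2, and that $E$ is the diagonal matrix of square-root energies with $E_1$ normalized to one. The function names are hypothetical.

```python
import math

# Error sequence e_6 from (51)
E6 = [-1, -1, 1, 1, 0, -1, 0, 1, -1, -1, 1, 1]


def build_h6(rho0: float, rho1: float) -> list[list[float]]:
    """Rebuild the 12x12 matrix of (53): three 4x4 blocks plus rho(1) couplings."""
    block = [[1, 0, rho0, 0],
             [0, 1, rho1, rho0],
             [rho0, rho1, 1, 0],
             [0, rho0, 0, 1]]
    h = [[0.0] * 12 for _ in range(12)]
    for b in range(3):
        for r in range(4):
            for c in range(4):
                h[4 * b + r][4 * b + c] = block[r][c]
    for b in range(2):                      # coupling between consecutive blocks
        i = 4 * b + 3
        h[i][i + 1] = rho1
        h[i + 1][i] = rho1
    return h


def efficiency_user1(rho0: float, rho1: float, energy_ratio: float = 1.0) -> float:
    """eta_1(e) with E1 = 1 and E2 = energy_ratio (assumed user ordering 1,1,2,2,...)."""
    h = build_h6(rho0, rho1)
    amp = [1.0 if (i // 2) % 2 == 0 else math.sqrt(energy_ratio) for i in range(12)]
    v = [a * e for a, e in zip(amp, E6)]
    quad = sum(v[i] * h[i][j] * v[j] for i in range(12) for j in range(12))
    return 0.5 * quad


if __name__ == "__main__":
    print(efficiency_user1(0.3, 0.3))               # equal energies
    print(efficiency_user1(0.3, 0.3, 0.96 ** 2))    # sqrt(E2/E1) = 0.96
```

Under these assumptions the equal-energy value agrees with $5 - 5\rho_{21}(0) - 3\rho_{21}(1)$, and for $\rho_{21}(0) = \rho_{21}(1) = 0.3$ the efficiency returns to $d_f/2 = 2.5$ at $\sqrt{E_2/E_1} = 0.96$.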

In general an interesting result is obtained when we examine $\eta_k(e)$ for $e$ sequences involving only single-user errors. Note that for every $e \in A_k(D,n)$ such that every nonzero element of $e$ corresponds to user $k$ (in other words, only user $k$ is involved in the error event),

$$\eta_k(e) = \tfrac{1}{2}\,\mathrm{wt}[e] = \tfrac{1}{2}\,\mathrm{wt}_k[e] \tag{57}$$

where $\mathrm{wt}[e]$ is the weight or number of nonzero elements of $e$ (or equivalently $e_\Gamma$), and $\mathrm{wt}_k[e]$ is the weight of user $k$'s subsequence of $e$. (User $k$'s subsequence is the set of all $\{e_i^{(1)}, e_i^{(2)}\}$ in $e$ such that $\beta(i) = k$.) Because

$$\min_{\substack{e \in A_k(D,n) \\ D \in C}} \frac{\mathrm{wt}_k[e]}{2} = \frac{d_f}{2} \tag{58}$$

we have the result that $\eta_k(e) \ge d_f/2$ for every $e \in A_k(D,n)$ such that every nonzero element of $e$ is contained in user $k$'s subsequence.

This result is important because it implies that single-user error events are not responsible if the AMCG is less than the ACG of a single-user system. We thus must examine multiple-user error events to find $\eta_k(e) < d_f/2$, which is the ACG for a rate 1/2 convolutional code.
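The single-user ACG of $d_f/2$ can be illustrated by computing the free distance of a 4-state, rate-1/2 code directly. The sketch below assumes the generator pair (7, 5) in octal, which is the standard 4-state code with $d_f = 5$; the source does not reproduce the generators of Figure 2, so this choice is an assumption. It runs a shortest-path search over the encoder state diagram for the lightest path that leaves the all-zero state and returns to it.

```python
import heapq


def free_distance(generators=(0b111, 0b101), memory=2) -> int:
    """Minimum output weight over paths that diverge from and re-merge with state 0."""

    def step(state: int, bit: int) -> tuple[int, int]:
        reg = (bit << memory) | state            # newest bit in the high position
        weight = sum(bin(reg & g).count("1") % 2 for g in generators)
        return reg >> 1, weight                  # drop the oldest bit

    start, w0 = step(0, 1)                       # forced divergence with input 1
    best = {start: w0}
    heap = [(w0, start)]
    while heap:
        w, s = heapq.heappop(heap)
        if s == 0:                               # re-merged with the zero path
            return w
        if w > best.get(s, float("inf")):
            continue
        for bit in (0, 1):
            ns, bw = step(s, bit)
            if w + bw < best.get(ns, float("inf")):
                best[ns] = w + bw
                heapq.heappush(heap, (w + bw, ns))
    raise RuntimeError("no re-merging path found")


if __name__ == "__main__":
    d_f = free_distance()
    print("free distance:", d_f, "single-user ACG:", d_f / 2)
```

For the assumed (7, 5) code the search returns a free distance of 5, so the single-user ACG is 2.5, matching the value $d_f/2$ used in the text.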

In general, the computation of $\eta_{k,\min}$ involves a search over all $e \in A_k(D,n)$ for each $D \in C$ reference sequence. Rather than attacking this problem directly, in this paper we will lower bound the worst-case efficiency for the 2-user situation, and will then illustrate some nice properties of the MLSE using this bound.

By studying the $H_\Gamma$ matrix for this 2-user case, we can obtain a lower bound on the result of equation (48) in the following way. Every nonzero element of $e_\Gamma$ will multiply its corresponding element of $e_\Gamma^T$ and the corresponding diagonal element of $H_\Gamma$, and be weighted by the energy for that element. We thus have, as a part of the result of (48), the weight of user 1's error subsequence multiplied by $E_1$ plus the weight of user 2's error subsequence multiplied by $E_2$. The remaining terms in the result of (48) are due to the product of elements of $e_\Gamma$ with other elements of $e_\Gamma^T$, weighted by the off-diagonal elements of $H_\Gamma$ and $(E_1E_2)^{1/2}$. If we lower bound the sum of these off-diagonal terms by a number that is smaller than is achievable by the actual off-diagonal terms, then we have a lower bound on equation (48). One possible lower bound on the off-diagonal terms leads to the following expression, which is only a function of the weight of the error sequences. It turns out that this expression is, in most situations, a somewhat loose lower bound on $\eta_k(e)$. We will focus on the performance of user 1 without any loss in generality.

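$$\eta_1(e) \ge \min\bigl\{\, f\bigl[(E_2/E_1)^{1/2},\, \mathrm{wt}_1[e],\, \mathrm{wt}_2[e],\, C\bigr],\; d_f/2 \,\bigr\} \tag{59}$$

where

$$f\bigl[(E_2/E_1)^{1/2},\, \mathrm{wt}_1[e],\, \mathrm{wt}_2[e],\, C\bigr] = \tfrac{1}{2}\Bigl(\mathrm{wt}_1[e] + (E_2/E_1)\,\mathrm{wt}_2[e] - (E_2/E_1)^{1/2}\bigl(2\min\{\mathrm{wt}_1[e], \mathrm{wt}_2[e]\} + 2\bigr)C\Bigr) \tag{60}$$

and where $C = |\rho_{21}(0)| + |\rho_{21}(1)|$. The function $f(\cdot)$ is a lower bound on $\eta_1(e)$ as long as $e$ has $\mathrm{wt}_1[e] > 0$ and $\mathrm{wt}_2[e] > 0$. We have already seen from (57) and (58) that $d_f/2$ is a lower bound on $\eta_1(e)$ when $\mathrm{wt}_2[e] = 0$, so the smaller of these two expressions is no greater than $\eta_1(e)$ for all $e \in A_k(D,n)$.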
For a fixed set of crosscorrelations and signal energies, and thus a constant $C$ and constant $\sqrt{E_2/E_1}$, the function $f(\sqrt{E_2/E_1},\, \mathrm{wt}_1[e],\, \mathrm{wt}_2[e],\, C)$ describes a family of parabolas, one for each value of $\mathrm{wt}_1[e]$ and $\mathrm{wt}_2[e]$. It is easy to show that


^(JE2/E{f^,wti^,wt2[!^.^  =fiXE2/Ei)^^,df^dfX 


This result implies that

■ni.mw  ^  E 


■  “yZ)  (62) 


which means that we have lower bounded the AMCG by a function which depends only on the users' energies, crosscorrelations and the free distance of the code. This bound on $\eta_{1,\min}$ is valid only for the 2-user, rate 1/2 code case, but it will illustrate some very important features of the performance of the MLSE which should remain true for the general $K$-user, rate $P/Q$ code cases as well. This bound will illustrate these performance features without requiring a solution to the NP-hard problem of searching for the actual error sequence, $e$, and corresponding reference sequence, $D$, which achieve the actual $\eta_{1,\min}$.

The first feature of the bound in (62) may be noted by examining the plot of $F(\cdot)$ as a function of $\sqrt{E_2/E_1}$ shown in Figure 3 for $C = 0.6$ and $d_f = 5$. As the interfering signal strength, $E_2$, becomes small relative to $E_1$, $F(\cdot)$ approaches the ACG of the single-user system. Also, as $E_2$ becomes large relative to $E_1$, $F(\cdot)$ again reaches the ACG of a single user. In fact, for

iE2/Ei)'^  >  2C-^  (63) 


the MLSE necessarily will have the same asymptotic performance as that of a single-user system. In fact, because $F(\cdot)$ is only a lower bound on the AMCG of the receiver, the actual energy ratio above which single-user performance is achieved may be significantly lower than the threshold given in (63). This point is illustrated by the dotted line in Figure 3, which is the actual plot of $\eta_1(e)$ for the $e$ given in equation (56). Without performing the search for $\eta_{1,\min}$ we do not know whether the $\eta_1(e)$ shown for that particular $e$ is the minimum, but if it is, then the actual threshold for $\sqrt{E_2/E_1}$ above which single-user asymptotic performance is achieved would be 0.96.

Another interesting feature of the bound in equation (62) is that it provides a lower bound on the near-far resistance of the MLSE, which is defined as the infimum of $\eta_{k,\min}$ over the energies of the interfering users [3]. This infimum for the function $F(\cdot)$ is

2  '"/■  (64) 

[O.o.)  {Ei/Exfe[Q,^)  -9  2  2df 

which  is  positive  for 

C=lp2i(0)l+lpji(l)l  <-^  (65) 

A  strictly  positive  near-far  resistance  implies  that  the  receiver  will  have  an  error  rate  that 
goes  to  zero  at  the  same  exponential  rate  as  a  single-user  system  operating  with  an  energy 
penalty  of 

It is also interesting to note that as the code which is employed becomes more powerful, or as $d_f$ increases, the conditions on the crosscorrelations of the users become progressively less restrictive to achieve near-far resistance. In other words, a stronger code allows the MLSE to remain near-far resistant on a channel with more severe MUI than would be possible with a weaker code. Again, however, because $F(\cdot)$ is simply a lower bound on the AMCG, the actual AMCG may be positive when the minimum of $F(\cdot)$ is not. Nonetheless, the fact that (65) implies a positive lower bound is an interesting feature of the bound in (62).

3.2  Simulation  Results 

To provide some direct comparisons between the performance of the MLSE and the conventional receiver in terms of bit error rate at a moderate to low $E_b/N_0$, we will use a computer simulation for some two-user cases. Figure 4 shows the results of a simulation of a two-user system where each user employs a 4-state rate 1/2 convolutional code. The resulting super-trellis used by the MLSE has 32 states. Figure 4 illustrates a severe MUI environment where $\rho_{12}(0) = 0.3$ and $\rho_{12}(-1) = 0.3$. In this case, the MLSE is able to recoup almost all of the loss that the conventional decoder suffers when compared with the performance in the single-user environment. In Figure 5, the same channel, with both crosscorrelations equal to 0.3, is simulated for a varying near-far energy ratio. This figure shows that the MLSE approach achieves a single-user performance level for sufficiently strong or sufficiently weak interference. This result is supported by the asymptotic performance suggested by the bound in Figure 3. In addition, equation (64) suggests that the MLSE is near-far resistant for this case, since $C = 0.6$ and $d_f = 5$. Also, the upper bound on the AMCG in Figure 3 suggests that there is not necessarily an asymptotic loss for the MLSE relative to the single-user performance level in the equal-energy case, since the AMCG is upper bounded by 2.5 at an energy ratio of one. This is supported by the simulation in Figure 4.

It is worth noting that all of the performance analysis in this paper has been based upon the metric for the case where each user in the system employs rate-1/2 convolutional codes. The expression for the distance and asymptotic multiuser coding gain will be more complicated in the general rate-$P/Q$ code case, but the derivation procedure will be the same. Thus the work in this paper is meant to illustrate the general procedure for the error analysis of the more complex general code rate case.


4.  Conclusions 

In  this  paper,  the  maximum  likelihood  sequence  estimator  was  formulated  for  CDMA 
systems  where  each  user  employs  a  convolutional  code  to  improve  its  performance.  It  was 
shown  that  the  complexity  of  the  MLSE  depends  exponentially  on  the  number  of  users  in  the 
system,  the  number  of  states  in  each  user’s  encoder  and  the  number  of  input  information  bits, 
P.  This  high  complexity  points  to  the  use  of  suboptimal  approaches  to  attempt  to  attain  high 
performance  levels  with  a  more  reasonable  complexity,  such  as  reduced  state  sequence  esti¬ 
mation  approaches  [15],  sequential  decoding  approaches  (currently  under  investigation  by 
the  authors  of  [16]),  linear  approaches,  [6]  and  [13],  or  multistage  decision  feedback 
approaches,  [11]  and  [13]. 
Figure 1 Maximum Likelihood Sequence Estimator for a convolutionally encoded CDMA system (CE: convolutional encoder).

Figure 2 Rate 1/2, 4-state convolutional code.

Figure 3 Plot of the lower bound on $\eta_{1,\min}$ for the 2-user, $\rho_{21}(0) = \rho_{21}(1) = 0.3$ case with each user employing the code shown in Figure 2. Also shown is the actual $\eta_1(e)$ for the specific error event given in equation (51). [Vertical axis: AMCG.]

Figure 4 Performance curves of the MLSE (dotted line) for a 2-user channel with $\rho_{12}(0) = 0.3$ and $\rho_{12}(-1) = 0.3$ and equal energies. The solid lines show a single-user system (no MUI) with and without the rate-1/2 4-state convolutional code.

Figure 5 Near-far ratio performance curves of the MLSE on a 2-user channel with $\rho_{12}(0) = 0.3$ and $\rho_{12}(-1) = 0.3$ at $E_{b1}/N_0 = 2$ dB. The single-user system performance level (no MUI) with the rate-1/2 4-state convolutional code is shown as a solid line and the MLSE performance is shown as a dotted line. [Horizontal axis: $E_{b2}/E_{b1}$ (dB).]